jsonpath
is a tool used to parse json data, similar to xpath
, jsonpath can parse very complex json data.
PostgreSQL json development history.
postgreSQL has supported json data type since 9.2, but due to the poor performance of parsing json data, it is not favored by everyone, but choose to use nosql database instead. So from pg9.4 onwards, the jsonb data type was supported. Compared to the json type, jsonb does not need to be parsed every time it is used, so the performance is much improved, and it also supports index queries and so on.
And from pg12 onwards the support for json is even more powerful: sql 2016 has 15 sql/json standards, PG 12 implements 14, far more than oracle (18c 11/15), mysql (8.0.4 5/15), sqlserver (2017 2/15) the latest version.
At the same time in pg12 introduced the jsonpath type, and a series of related functions, making the query performance of json data further, and more and more powerful.
JSONPATH syntax.
The syntax of the JSONpath function expression is as follows.
- The dot . Indicates an element that references Json data
- Square brackets [] indicate a reference to an array element
- Subscripts for array elements in Json data start from 0
The variables in JSONpath are as follows.
- The $ symbol indicates the variable of the Json text to be queried
- $varname indicates the specified variable
- @ refers to the variable that represents the current path element in the filter expression
Example of JSONPATH usage.
Simple query:
1
2
3
4
5
|
bill@bill=>SELECT jsonb_path_query_array('[1,2,3,4,5]', '$[*] ? (@ > 3)');
jsonb_path_query_array
------------------------
[4, 5]
(1 row)
|
Create test tables:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
|
CREATE TABLE house(js jsonb);
INSERT INTO house VALUES
('{
'address': {
'city':'Moscow',
'street': 'Ulyanova, 7A'
},
'lift': false,
'floor': [
{
'level': 1,
'apt': [
{'no': 1, 'area': 40, 'rooms': 1},
{'no': 2, 'area': 80, 'rooms': 3},
{'no': 3, 'area': 50, 'rooms': 2}
]
},
{
'level': 2,
'apt': [
{'no': 4, 'area': 100, 'rooms': 3},
{'no': 5, 'area': 60, 'rooms': 2}
]
}
]
}');
|
Query:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
|
bill@bill=>select jsonb_pretty(js) from house ;
jsonb_pretty
----------------------------------
{ +
'lift': false, +
'floor': [ +
{ +
'apt': [ +
{ +
'no': 1, +
'area': 40, +
'rooms': 1 +
}, +
{ +
'no': 2, +
'area': 80, +
'rooms': 3 +
}, +
{ +
'no': 3, +
'area': 50, +
'rooms': 2 +
} +
], +
'level': 1 +
}, +
{ +
'apt': [ +
{ +
'no': 4, +
'area': 100,+
'rooms': 3 +
}, +
{ +
'no': 5, +
'area': 60, +
'rooms': 2 +
} +
], +
'level': 2 +
} +
], +
'address': { +
'city': 'Moscow', +
'street': 'Ulyanova, 7A'+
} +
}
(1 row)
|
The hierarchy of this data is shown in the following figure.
It looks like a complex data hierarchy, but in practice it can be much more complex than that, so let’s see how to use jsonpath for querying.
1
2
3
4
5
|
bill@bill=>SELECT jsonb_path_query_array(js, '$.floor[0, 1].apt[1 to last]') FROM house;
jsonb_path_query_array
-----------------------------------------------------------------------------------------------------------
[{'no': 2, 'area': 80, 'rooms': 3}, {'no': 3, 'area': 50, 'rooms': 2}, {'no': 5, 'area': 60, 'rooms': 2}]
(1 row)
|
And if we don’t use jsonpath, we may need to write it like this.
1
2
3
4
5
6
7
8
9
|
bill@bill=>SELECT jsonb_agg(apt) FROM (
bill(# SELECT apt->generate_series(1, jsonb_array_length(apt) - 1) FROM (
bill(# SELECT js->'floor'->unnest(array[0, 1])->'apt' FROM house
bill(# ) apts(apt)
bill(# ) apts(apt);
jsonb_agg
-----------------------------------------------------------------------------------------------------------
[{'no': 2, 'area': 80, 'rooms': 3}, {'no': 3, 'area': 50, 'rooms': 2}, {'no': 5, 'area': 60, 'rooms': 2}]
(1 row)
|
In contrast, it is much easier to use the jsonpath-related functions.
For example, if we need to determine whether json data contains a value, we can do this.
1
2
3
4
5
|
bill@bill=>SELECT jsonb_path_exists(js, '$.** ? (@ == 'Moscow')') FROM house;
jsonb_path_exists
-------------------
t
(1 row)
|
And what if you don’t use jsonpath?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
|
bill@bill=>WITH RECURSIVE t(value) AS (
bill(# SELECT * FROM house UNION ALL (
bill(# SELECT COALESCE(kv.value, e.value) AS value
bill(# FROM t
bill(# LEFT JOIN LATERAL jsonb_each (
bill(# CASE WHEN jsonb_typeof(t.value) = 'object' THEN t.value
bill(# ELSE NULL END
bill(# ) kv ON true
bill(# LEFT JOIN LATERAL jsonb_array_elements (
bill(# CASE WHEN jsonb_typeof(t.value) = 'array' THEN t.value
bill(# ELSE NULL END
bill(# ) e ON true
bill(# WHERE kv.value IS NOT NULL OR e.value IS NOT NULL
bill(# )
bill(# ) SELECT EXISTS (SELECT 1 FROM t WHERE value = ''Moscow'');
exists
--------
t
(1 row)
|
So how to use jsonpath for data filtering? Take the above table as an example, we query the data with apt.no greater than 3.
1
2
3
4
5
6
|
bill@bill=>SELECT jsonb_path_query(js, '$.floor.apt.no ? (@>3)') FROM house;
jsonb_path_query
------------------
4
5
(2 rows)
|
Similarly, jsonpath supports the use of indexes.
1
2
3
4
5
6
7
8
9
10
11
12
13
|
bill@bill=>CREATE INDEX ON house USING gin (js);
CREATE INDEX
bill@bill=>SET ENABLE_SEQSCAN TO OFF;
SET
bill@bill=>EXPLAIN (COSTS OFF) SELECT * FROM house
bill-# WHERE js @? '$.floor[*].apt[*] ? (@.rooms == 3)';
QUERY PLAN
--------------------------------------------------------------------------------
Bitmap Heap Scan on house
Recheck Cond: (js @? '$.'floor'[*].'apt'[*]?(@.'rooms' == 3)'::jsonpath)
-> Bitmap Index Scan on house_js_idx
Index Cond: (js @? '$.'floor'[*].'apt'[*]?(@.'rooms' == 3)'::jsonpath)
(4 rows)
|
In addition, jsonpath supports more than 20 kinds of related functions, is not very powerful, hurry up and use it!
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
|
postgres=# \df *.*json*path*
List of functions
Schema | Name | Result data type | Argument data types
| Type
------------+------------------------------+------------------+------------------------------------------------------------------------------------------
-+------
pg_catalog | gin_consistent_jsonb_path | boolean | internal, smallint, jsonb, integer, internal, internal, internal, internal
| func
pg_catalog | gin_extract_jsonb_path | internal | jsonb, internal, internal
| func
pg_catalog | gin_extract_jsonb_query_path | internal | jsonb, internal, smallint, internal, internal, internal, internal
| func
pg_catalog | gin_triconsistent_jsonb_path | 'char' | internal, smallint, jsonb, integer, internal, internal, internal
| func
pg_catalog | json_extract_path | json | from_json json, VARIADIC path_elems text[]
| func
pg_catalog | json_extract_path_text | text | from_json json, VARIADIC path_elems text[]
| func
pg_catalog | jsonb_delete_path | jsonb | jsonb, text[]
| func
pg_catalog | jsonb_extract_path | jsonb | from_json jsonb, VARIADIC path_elems text[]
| func
pg_catalog | jsonb_extract_path_text | text | from_json jsonb, VARIADIC path_elems text[]
| func
pg_catalog | jsonb_path_exists | boolean | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
| func
pg_catalog | jsonb_path_exists_opr | boolean | jsonb, jsonpath
| func
pg_catalog | jsonb_path_exists_tz | boolean | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
| func
pg_catalog | jsonb_path_match | boolean | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
| func
pg_catalog | jsonb_path_match_opr | boolean | jsonb, jsonpath
| func
pg_catalog | jsonb_path_match_tz | boolean | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
| func
pg_catalog | jsonb_path_query | SETOF jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
| func
pg_catalog | jsonb_path_query_array | jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
| func
pg_catalog | jsonb_path_query_array_tz | jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
| func
pg_catalog | jsonb_path_query_first | jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
| func
pg_catalog | jsonb_path_query_first_tz | jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
| func
pg_catalog | jsonb_path_query_tz | SETOF jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
| func
pg_catalog | jsonpath_in | jsonpath | cstring
| func
pg_catalog | jsonpath_out | cstring | jsonpath
| func
pg_catalog | jsonpath_recv | jsonpath | internal
| func
pg_catalog | jsonpath_send | bytea | jsonpath
| func
(25 rows)
|
Reference link:
Reference https://my.oschina.net/postgresqlchina/blog/5079754