Transforming input data
Last modified: May 24, 2022
Data file
[1]:
!cat /opt/druid/quickstart/tutorial/transform-data.json
{"timestamp":"2018-01-01T07:01:35Z","animal":"octopus", "location":1, "number":100}
{"timestamp":"2018-01-01T05:01:35Z","animal":"mongoose", "location":2,"number":200}
{"timestamp":"2018-01-01T06:01:35Z","animal":"snake", "location":3, "number":300}
{"timestamp":"2018-01-01T01:01:35Z","animal":"lion", "location":4, "number":300}
Specification
[2]:
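#
# Line 33: expression transform that prefixes `animal` with 'super-'
# Line 38: expression transform that computes `triple-number` as number * 3
#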
!cat /opt/druid/quickstart/tutorial/transform-index.json | nl
 1  {
 2    "type" : "index_parallel",
 3    "spec" : {
 4      "dataSchema" : {
 5        "dataSource" : "transform-tutorial",
 6        "timestampSpec": {
 7          "column": "timestamp",
 8          "format": "iso"
 9        },
10        "dimensionsSpec" : {
11          "dimensions" : [
12            "animal",
13            { "name": "location", "type": "long" }
14          ]
15        },
16        "metricsSpec" : [
17          { "type" : "count", "name" : "count" },
18          { "type" : "longSum", "name" : "number", "fieldName" : "number" },
19          { "type" : "longSum", "name" : "triple-number", "fieldName" : "triple-number" }
20        ],
21        "granularitySpec" : {
22          "type" : "uniform",
23          "segmentGranularity" : "week",
24          "queryGranularity" : "minute",
25          "intervals" : ["2018-01-01/2018-01-03"],
26          "rollup" : true
27        },
28        "transformSpec": {
29          "transforms": [
30            {
31              "type": "expression",
32              "name": "animal",
33              "expression": "concat('super-', animal)"
34            },
35            {
36              "type": "expression",
37              "name": "triple-number",
38              "expression": "number * 3"
39            }
40          ],
41          "filter": {
42            "type":"or",
43            "fields": [
44              { "type": "selector", "dimension": "animal", "value": "super-mongoose" },
45              { "type": "selector", "dimension": "triple-number", "value": "300" },
46              { "type": "selector", "dimension": "location", "value": "3" }
47            ]
48          }
49        }
50      },
51      "ioConfig" : {
52        "type" : "index_parallel",
53        "inputSource" : {
54          "type" : "local",
55          "baseDir" : "quickstart/tutorial",
56          "filter" : "transform-data.json"
57        },
58        "inputFormat" : {
59          "type" : "json"
60        },
61        "appendToExisting" : false
62      },
63      "tuningConfig" : {
64        "type" : "index_parallel",
65        "maxRowsPerSegment" : 5000000,
66        "maxRowsInMemory" : 25000
67      }
68    }
69  }
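The effect of the `transformSpec` can be previewed without running Druid. The following is a minimal Python sketch, not Druid's implementation: it applies the two expression transforms (lines 33 and 38 of the spec) to the four input rows and then evaluates the `or` filter on the transformed values. The helper names (`apply_transforms`, `passes_filter`, `simulate`) are hypothetical, and comparing selector values as strings is an assumption made for this illustration.

```python
import json

# Rows copied verbatim from transform-data.json shown earlier.
RAW_ROWS = """\
{"timestamp":"2018-01-01T07:01:35Z","animal":"octopus", "location":1, "number":100}
{"timestamp":"2018-01-01T05:01:35Z","animal":"mongoose", "location":2,"number":200}
{"timestamp":"2018-01-01T06:01:35Z","animal":"snake", "location":3, "number":300}
{"timestamp":"2018-01-01T01:01:35Z","animal":"lion", "location":4, "number":300}
"""


def apply_transforms(row):
    """Mimic the two expression transforms from the spec."""
    out = dict(row)
    out["animal"] = "super-" + row["animal"]       # concat('super-', animal)
    out["triple-number"] = row["number"] * 3       # number * 3
    return out


def passes_filter(row):
    """Mimic the 'or' filter; values are compared as strings (assumption)."""
    return (
        str(row["animal"]) == "super-mongoose"
        or str(row["triple-number"]) == "300"
        or str(row["location"]) == "3"
    )


def simulate(raw_text):
    """Parse newline-delimited JSON, transform every row, then filter."""
    rows = [json.loads(line) for line in raw_text.splitlines() if line.strip()]
    transformed = [apply_transforms(r) for r in rows]
    return [r for r in transformed if passes_filter(r)]


if __name__ == "__main__":
    for r in simulate(RAW_ROWS):
        print(r["animal"], r["location"], r["number"], r["triple-number"])
```

In this sketch the `lion` row is dropped: after the transforms its values are `super-lion`, `900` and location `4`, so none of the three selector clauses match. The filter clauses refer to transformed values (`super-mongoose`, `triple-number`), which is why the sketch filters after transforming.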
[3]:
!post-index-task --file /opt/druid/quickstart/tutorial/transform-index.json --url http://localhost:8081
Beginning indexing data for transform-tutorial
Task started: index_parallel_transform-tutorial_aeceionl_2022-05-25T04:38:39.312Z
Task log: http://localhost:8081/druid/indexer/v1/task/index_parallel_transform-tutorial_aeceionl_2022-05-25T04:38:39.312Z/log
Task status: http://localhost:8081/druid/indexer/v1/task/index_parallel_transform-tutorial_aeceionl_2022-05-25T04:38:39.312Z/status
Task index_parallel_transform-tutorial_aeceionl_2022-05-25T04:38:39.312Z still running...
Task index_parallel_transform-tutorial_aeceionl_2022-05-25T04:38:39.312Z still running...
Task finished with status: SUCCESS
Completed indexing data for transform-tutorial. Now loading indexed data onto the cluster...
transform-tutorial is 0.0% finished loading...
transform-tutorial is 0.0% finished loading...
transform-tutorial is 0.0% finished loading...
transform-tutorial loading complete! You may now query your data
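The "still running..." lines above come from polling the task status URL printed in the log. A rough Python sketch of the same polling loop follows. The URL pattern is taken from the log output; the response layout (`payload["status"]["status"]`) and the helper names (`status_url`, `fetch_status`, `wait_for_task`) are assumptions for illustration and should be checked against the Druid version in use.

```python
import json
import time
import urllib.request

# Same base URL that was passed to post-index-task with --url.
OVERLORD_URL = "http://localhost:8081"


def status_url(task_id, base_url=OVERLORD_URL):
    """Build the status URL in the form shown in the task log."""
    return f"{base_url}/druid/indexer/v1/task/{task_id}/status"


def fetch_status(task_id):
    """Fetch the task state over HTTP."""
    with urllib.request.urlopen(status_url(task_id)) as resp:
        payload = json.load(resp)
    # Assumption: the state string lives at payload["status"]["status"].
    return payload["status"]["status"]


def wait_for_task(task_id, fetch=fetch_status, interval=2.0, max_polls=150):
    """Poll until the task reports SUCCESS or FAILED, or give up."""
    for _ in range(max_polls):
        state = fetch(task_id)
        if state in ("SUCCESS", "FAILED"):
            return state
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")
```

The `fetch` parameter is injectable so the loop can be exercised without a running cluster, for example `wait_for_task("some-task-id", fetch=lambda _id: "SUCCESS", interval=0)`.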
Querying the data
[4]:
!dsql -e 'select * from "transform-tutorial"'
┌──────────────────────────┬────────────────┬───────┬──────────┬────────┬───────────────┐
│ __time                   │ animal         │ count │ location │ number │ triple-number │
├──────────────────────────┼────────────────┼───────┼──────────┼────────┼───────────────┤
│ 2018-01-01T05:01:00.000Z │ super-mongoose │     1 │        2 │    200 │           600 │
│ 2018-01-01T06:01:00.000Z │ super-snake    │     1 │        3 │    300 │           900 │
│ 2018-01-01T07:01:00.000Z │ super-octopus  │     1 │        1 │    100 │           300 │
└──────────────────────────┴────────────────┴───────┴──────────┴────────┴───────────────┘
Retrieved 3 rows in 0.02s.
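The `__time` values in the result end in `:00.000Z` although the input timestamps end in `:35Z`. This matches the spec's `"queryGranularity" : "minute"` combined with `"rollup" : true`. The sketch below illustrates the idea in plain Python under simplifying assumptions: timestamps are truncated to the minute, rows are grouped by truncated time plus dimension values, and the metrics are summed. It is an illustration only, not how Druid does it internally; `truncate_to_minute` and `rollup` are hypothetical helper names.

```python
from collections import defaultdict
from datetime import datetime, timezone

# The three rows that survive the transforms and the filter.
ROWS = [
    {"timestamp": "2018-01-01T07:01:35Z", "animal": "super-octopus",
     "location": 1, "number": 100, "triple-number": 300},
    {"timestamp": "2018-01-01T05:01:35Z", "animal": "super-mongoose",
     "location": 2, "number": 200, "triple-number": 600},
    {"timestamp": "2018-01-01T06:01:35Z", "animal": "super-snake",
     "location": 3, "number": 300, "triple-number": 900},
]


def truncate_to_minute(iso_ts):
    """Drop seconds from an ISO timestamp, mirroring minute granularity."""
    ts = datetime.strptime(iso_ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    ts = ts.replace(second=0, microsecond=0)
    return ts.strftime("%Y-%m-%dT%H:%M:%S.000Z")


def rollup(rows):
    """Group by (truncated time, animal, location) and sum the metrics."""
    groups = defaultdict(lambda: {"count": 0, "number": 0, "triple-number": 0})
    for r in rows:
        key = (truncate_to_minute(r["timestamp"]), r["animal"], r["location"])
        g = groups[key]
        g["count"] += 1                          # "count" metric
        g["number"] += r["number"]               # longSum over number
        g["triple-number"] += r["triple-number"] # longSum over triple-number
    return dict(sorted(groups.items()))


if __name__ == "__main__":
    for (t, animal, location), m in rollup(ROWS).items():
        print(t, animal, m["count"], location, m["number"], m["triple-number"])
```

Because every row here has a distinct minute and distinct dimension values, each group holds exactly one row, which is consistent with `count` being 1 in every result row.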