Summarization Using Roll-up#
Last modified: May 24, 2022
Data file#
[1]:
!cat /opt/druid/quickstart/tutorial/rollup-data.json
{"timestamp":"2018-01-01T01:01:35Z","srcIP":"1.1.1.1", "dstIP":"2.2.2.2","packets":20,"bytes":9024}
{"timestamp":"2018-01-01T01:01:51Z","srcIP":"1.1.1.1", "dstIP":"2.2.2.2","packets":255,"bytes":21133}
{"timestamp":"2018-01-01T01:01:59Z","srcIP":"1.1.1.1", "dstIP":"2.2.2.2","packets":11,"bytes":5780}
{"timestamp":"2018-01-01T01:02:14Z","srcIP":"1.1.1.1", "dstIP":"2.2.2.2","packets":38,"bytes":6289}
{"timestamp":"2018-01-01T01:02:29Z","srcIP":"1.1.1.1", "dstIP":"2.2.2.2","packets":377,"bytes":359971}
{"timestamp":"2018-01-01T01:03:29Z","srcIP":"1.1.1.1", "dstIP":"2.2.2.2","packets":49,"bytes":10204}
{"timestamp":"2018-01-02T21:33:14Z","srcIP":"7.7.7.7", "dstIP":"8.8.8.8","packets":38,"bytes":6289}
{"timestamp":"2018-01-02T21:33:45Z","srcIP":"7.7.7.7", "dstIP":"8.8.8.8","packets":123,"bytes":93999}
{"timestamp":"2018-01-02T21:35:45Z","srcIP":"7.7.7.7", "dstIP":"8.8.8.8","packets":12,"bytes":2818}
Ingestion spec#
[2]:
#
# Note the `"rollup" : true` setting on line 26
#
!cat /opt/druid/quickstart/tutorial/rollup-index.json | nl
1 {
2 "type" : "index_parallel",
3 "spec" : {
4 "dataSchema" : {
5 "dataSource" : "rollup-tutorial",
6 "timestampSpec": {
7 "column": "timestamp",
8 "format": "iso"
9 },
10 "dimensionsSpec" : {
11 "dimensions" : [
12 "srcIP",
13 "dstIP"
14 ]
15 },
16 "metricsSpec" : [
17 { "type" : "count", "name" : "count" },
18 { "type" : "longSum", "name" : "packets", "fieldName" : "packets" },
19 { "type" : "longSum", "name" : "bytes", "fieldName" : "bytes" }
20 ],
21 "granularitySpec" : {
22 "type" : "uniform",
23 "segmentGranularity" : "week",
24 "queryGranularity" : "minute",
25 "intervals" : ["2018-01-01/2018-01-03"],
26 "rollup" : true
27 }
28 },
29 "ioConfig" : {
30 "type" : "index_parallel",
31 "inputSource" : {
32 "type" : "local",
33 "baseDir" : "quickstart/tutorial",
34 "filter" : "rollup-data.json"
35 },
36 "inputFormat" : {
37 "type" : "json"
38 },
39 "appendToExisting" : false
40 },
41 "tuningConfig" : {
42 "type" : "index_parallel",
43 "maxRowsPerSegment" : 5000000,
44 "maxRowsInMemory" : 25000
45 }
46 }
47 }
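To make the effect of this spec concrete, the following is a minimal sketch in plain Python of the aggregation it describes: timestamps are truncated to the minute (`queryGranularity`), rows are grouped by the truncated timestamp plus the `srcIP` and `dstIP` dimensions, and the three metrics are computed per group. It only imitates the result; it is not how Druid implements roll-up internally, and the names `EVENTS`, `floor_to_minute`, and `rollup` are hypothetical helpers introduced for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone

# The nine events from rollup-data.json: (timestamp, srcIP, dstIP, packets, bytes).
EVENTS = [
    ("2018-01-01T01:01:35Z", "1.1.1.1", "2.2.2.2", 20, 9024),
    ("2018-01-01T01:01:51Z", "1.1.1.1", "2.2.2.2", 255, 21133),
    ("2018-01-01T01:01:59Z", "1.1.1.1", "2.2.2.2", 11, 5780),
    ("2018-01-01T01:02:14Z", "1.1.1.1", "2.2.2.2", 38, 6289),
    ("2018-01-01T01:02:29Z", "1.1.1.1", "2.2.2.2", 377, 359971),
    ("2018-01-01T01:03:29Z", "1.1.1.1", "2.2.2.2", 49, 10204),
    ("2018-01-02T21:33:14Z", "7.7.7.7", "8.8.8.8", 38, 6289),
    ("2018-01-02T21:33:45Z", "7.7.7.7", "8.8.8.8", 123, 93999),
    ("2018-01-02T21:35:45Z", "7.7.7.7", "8.8.8.8", 12, 2818),
]


def floor_to_minute(ts):
    """Truncate an ISO-8601 UTC timestamp to the start of its minute."""
    parsed = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return parsed.replace(second=0, microsecond=0)


def rollup(events):
    """Group by (minute, srcIP, dstIP); count rows and sum packets and bytes."""
    groups = defaultdict(lambda: {"count": 0, "packets": 0, "bytes": 0})
    for ts, src, dst, packets, nbytes in events:
        metrics = groups[(floor_to_minute(ts), src, dst)]
        metrics["count"] += 1          # mirrors the "count" metric
        metrics["packets"] += packets  # mirrors longSum over "packets"
        metrics["bytes"] += nbytes     # mirrors longSum over "bytes"
    return dict(sorted(groups.items()))


rolled = rollup(EVENTS)
for (minute, src, dst), m in rolled.items():
    stamp = minute.strftime("%Y-%m-%dT%H:%M:%S.000Z")
    print(stamp, src, dst, m["count"], m["packets"], m["bytes"])
```

Under these assumptions the nine input events collapse into five groups, one per distinct combination of minute, source IP, and destination IP.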
Running the ingestion#
[3]:
!post-index-task --file /opt/druid/quickstart/tutorial/rollup-index.json --url http://localhost:8081
Beginning indexing data for rollup-tutorial
Task started: index_parallel_rollup-tutorial_hfmlejkb_2022-05-25T04:36:37.719Z
Task log: http://localhost:8081/druid/indexer/v1/task/index_parallel_rollup-tutorial_hfmlejkb_2022-05-25T04:36:37.719Z/log
Task status: http://localhost:8081/druid/indexer/v1/task/index_parallel_rollup-tutorial_hfmlejkb_2022-05-25T04:36:37.719Z/status
Task index_parallel_rollup-tutorial_hfmlejkb_2022-05-25T04:36:37.719Z still running...
Task index_parallel_rollup-tutorial_hfmlejkb_2022-05-25T04:36:37.719Z still running...
Task finished with status: SUCCESS
Completed indexing data for rollup-tutorial. Now loading indexed data onto the cluster...
rollup-tutorial is 0.0% finished loading...
rollup-tutorial is 0.0% finished loading...
rollup-tutorial is 0.0% finished loading...
rollup-tutorial loading complete! You may now query your data
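The log above prints a "Task status" URL for the submitted task. As a rough sketch, the helpers below rebuild that URL from a task ID and decide from a status response whether the task has finished. The response shape, a JSON object with a nested `status.status` field, is an assumption based on the Overlord task API and should be checked against your Druid version. The names `task_status_url` and `is_finished` are hypothetical, and the sample payload is illustrative rather than captured from a real run.

```python
import json

# Same base URL that was passed to post-index-task via --url.
OVERLORD_URL = "http://localhost:8081"


def task_status_url(task_id, base_url=OVERLORD_URL):
    """Build the status endpoint URL shown in the ingestion log."""
    return f"{base_url}/druid/indexer/v1/task/{task_id}/status"


def is_finished(status_body):
    """Return True once the task reports SUCCESS or FAILED.

    Assumes a body shaped like {"status": {"status": "SUCCESS"}}.
    """
    state = json.loads(status_body)["status"]["status"]
    return state in ("SUCCESS", "FAILED")


task_id = "index_parallel_rollup-tutorial_hfmlejkb_2022-05-25T04:36:37.719Z"
print(task_status_url(task_id))

# Illustrative payloads only (assumed shape, not real server output).
running_body = '{"status": {"status": "RUNNING"}}'
success_body = '{"status": {"status": "SUCCESS"}}'
print(is_finished(running_body), is_finished(success_body))
```

In a notebook, the body passed to `is_finished` could come from an HTTP GET on the URL returned by `task_status_url`, repeated until it returns `True`.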
Ingested data#
[4]:
!dsql -e 'select * from "rollup-tutorial"'
┌──────────────────────────┬────────┬───────┬─────────┬─────────┬─────────┐
│ __time │ bytes │ count │ dstIP │ packets │ srcIP │
├──────────────────────────┼────────┼───────┼─────────┼─────────┼─────────┤
│ 2018-01-01T01:01:00.000Z │ 35937 │ 3 │ 2.2.2.2 │ 286 │ 1.1.1.1 │
│ 2018-01-01T01:02:00.000Z │ 366260 │ 2 │ 2.2.2.2 │ 415 │ 1.1.1.1 │
│ 2018-01-01T01:03:00.000Z │ 10204 │ 1 │ 2.2.2.2 │ 49 │ 1.1.1.1 │
│ 2018-01-02T21:33:00.000Z │ 100288 │ 2 │ 8.8.8.8 │ 161 │ 7.7.7.7 │
│ 2018-01-02T21:35:00.000Z │ 2818 │ 1 │ 8.8.8.8 │ 12 │ 7.7.7.7 │
└──────────────────────────┴────────┴───────┴─────────┴─────────┴─────────┘
Retrieved 5 rows in 0.13s.
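The `count` column records how many input events were merged into each stored row, so summing it recovers the original number of events. The short sketch below uses the five rows returned above, copied by hand into a hypothetical `ROLLED_ROWS` list, to compare stored rows against original events.

```python
# Rows as returned by the query above: (__time, count, packets, bytes).
ROLLED_ROWS = [
    ("2018-01-01T01:01:00.000Z", 3, 286, 35937),
    ("2018-01-01T01:02:00.000Z", 2, 415, 366260),
    ("2018-01-01T01:03:00.000Z", 1, 49, 10204),
    ("2018-01-02T21:33:00.000Z", 2, 161, 100288),
    ("2018-01-02T21:35:00.000Z", 1, 12, 2818),
]

stored_rows = len(ROLLED_ROWS)                                # rows kept after roll-up
original_rows = sum(count for _, count, _, _ in ROLLED_ROWS)  # events before roll-up
ratio = stored_rows / original_rows

print(f"stored rows:   {stored_rows}")
print(f"original rows: {original_rows}")
print(f"roll-up ratio: {ratio:.2f}")
```

The same comparison can presumably be made in SQL by selecting `COUNT(*)` alongside `SUM("count")` from `"rollup-tutorial"`: the first counts stored rows, the second counts original events.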
[5]:
#
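# Note that the first three input records (01:01:35, 01:01:51 and 01:01:59)
# were aggregated into a single row, using {timestamp, srcIP, dstIP} as
# dimensions, with the timestamp truncated to the minute.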
#
!dsql -e 'select * from "rollup-tutorial" limit 3'
┌──────────────────────────┬────────┬───────┬─────────┬─────────┬─────────┐
│ __time │ bytes │ count │ dstIP │ packets │ srcIP │
├──────────────────────────┼────────┼───────┼─────────┼─────────┼─────────┤
│ 2018-01-01T01:01:00.000Z │ 35937 │ 3 │ 2.2.2.2 │ 286 │ 1.1.1.1 │
│ 2018-01-01T01:02:00.000Z │ 366260 │ 2 │ 2.2.2.2 │ 415 │ 1.1.1.1 │
│ 2018-01-01T01:03:00.000Z │ 10204 │ 1 │ 2.2.2.2 │ 49 │ 1.1.1.1 │
└──────────────────────────┴────────┴───────┴─────────┴─────────┴─────────┘
Retrieved 3 rows in 0.06s.
[6]:
#
# The same happened for the records at 2018-01-01T01:02
#