开发者

How can I filter for a specific date on a CQL timestamp column?

开发者 https://www.devze.com 2022-12-07 22:03 出处:网络
I have a table defined as: CREATE TABLE downtime( asset_code text, down_start timestamp, down_end timestamp,

I have a table defined as:

CREATE TABLE downtime(
    asset_code text,
    down_start timestamp,
    down_end timestamp,
    down_duration duration,
    down_type text,
    down_reason text,
    PRIMARY KEY ((asset_code, down_start), down_end)
);

I'd like to get downtime on a particular day, such as:

SELECT * FROM downtime \
  WHERE asset_code = 'CA-PU-03-LB' \
  AND todate(down_start) = '2022-12-11';

I got a syntax error:

SyntaxException: line 1:66 no viable alternative at input '(' (...where asset_code = 开发者_Go百科'CA-PU-03-LB' and [todate](...)

If function is not allowed on a partition key in where clause, how can I get data with "down_start" of a particular day?


You don't need to use the TODATE() function to filter for a specific date. You can simply specify the date as '2022-12-11' when applying a filter on a CQL timestamp column.

But the difference is that you cannot use the equality operator (=) because the CQL timestamp data type is encoded as the number of milliseconds since Unix epoch (Jan 1, 1970 00:00 GMT) so you need to be precise when you're working with timestamps.

Let me illustrate using this example table:

CREATE TABLE tstamps (
    id int,
    tstamp timestamp,
    colour text,
    PRIMARY KEY (id, tstamp)
)

My table contains the following sample data:

cqlsh> SELECT * FROM tstamps ;

 id | tstamp                          | colour
----+---------------------------------+--------
  1 | 2022-12-05 11:25:01.000000+0000 |    red
  1 | 2022-12-06 02:45:04.564000+0000 | yellow
  1 | 2022-12-06 11:06:48.119000+0000 | orange
  1 | 2022-12-06 19:02:52.192000+0000 |  green
  1 | 2022-12-07 01:48:07.870000+0000 |   blue
  1 | 2022-12-07 03:13:27.313000+0000 | indigo

The cqlshi client formats the tstamp column into a human-readable date in UTC. But really, the tstamp values are stored as integers:

cqlsh> SELECT tstamp, TOUNIXTIMESTAMP(tstamp) FROM tstamps ;

 tstamp                          | system.tounixtimestamp(tstamp)
---------------------------------+--------------------------------
 2022-12-05 11:25:01.000000+0000 |                  1670239501000
 2022-12-06 02:45:04.564000+0000 |                  1670294704564
 2022-12-06 11:06:48.119000+0000 |                  1670324808119
 2022-12-06 19:02:52.192000+0000 |                  1670353372192
 2022-12-07 01:48:07.870000+0000 |                  1670377687870
 2022-12-07 03:13:27.313000+0000 |                  1670382807313

To retrieve the rows for a specific date, you need to specify the range of timestamps which fall on a specific date. For example, the timestamps for 6 Dec 2022 UTC ranges from 1670284800000 (2022-12-06 00:00:00.000 UTC) to 1670371199999 (2022-12-06 23:59:59.999 UTC).

This means if we want to query for December 6, we need to filter using a range query:

SELECT * FROM tstamps \
  WHERE id = 1 \
  AND tstamp >= '2022-12-06' \
  AND tstamp < '2022-12-07';

and we get:

 id | tstamp                          | colour
----+---------------------------------+--------
  1 | 2022-12-06 02:45:04.564000+0000 | yellow
  1 | 2022-12-06 11:06:48.119000+0000 | orange
  1 | 2022-12-06 19:02:52.192000+0000 |  green

WARNING - In your case where the timestamp column is part of the partition key, performing a range query is dangerous because it results in a multi-partition query -- there are 86M possible values between 1670284800000 and 1670371199999. For this reason, timestamps are not a good choice for partition keys. Cheers!


0

精彩评论

暂无评论...
验证码 换一张
取 消