Saving user-defined Views with computed columns in a database structure_问答_开发者

I have acquisition data from various sensors organized in db tables.

I give user the ability to create his own Data Views adding columns, one for each sensor, grouped by data, like this

 View #1
+---------------------+------+----+-----+
|      datetime       |  v1  | v2 |  v3 |
+---------------------+------+----+-----+
| 2010-09-13 00:05:00 | 40.9 |  1 | 0.3 |
| 2010-09-13 00:10:00 | 41.0 |  2 | 0.3 |
| 2010-09-13 00:15:00 | 41.1 |  4 | 0.3 |
+---------------------+------+----+-----+

To store this kind of Views I use two tables: views and view_opts

views
    id
    name

view_opts
    id
    id_view
    id_sensor
    ord #for column ordering

Now I have to go one step further: add more columns, not coming directly from sensor data but as a computation of other colums, like this:

 View #1
+---------------------+------+----+-----------+-----+
|      datetime       |  v1  | v2 | (v1+v2)/2 |  v3 |
+---------------------+------+----+-----------+-----+
| 2010-09-13 00:05:00 | 40.9 |  1 |   20.95   | 0.3 |
| 2010-09-13 00:10:00 | 41.0 |  2 |   21.5    | 0.3 |
| 2010-09-13 00:15:00 | 41.1 |  4 |   22.55   | 0.3 |
+---------------------+------+----+-----------+-----+

If this is not hard to achieve in SQL (I wrap the old SELECT in a new one that makes computation) I have to extend the way I store Views someway.

I thought at several ways, all with some pro and a lot of cons...

Any good idea?

================

Edit:

one of my thoughts was to add a text field to view-opts to let user add his own computations in SQL-like syntax:

(v1+v2)/2

in case sensor foreign keys are null in view_opt the text is used in query, something like:

SELECT v1, v2, (v1+v2)/2, v3 FROM (
    SELECT v1, v2, v3 FROM ...
)

Of course I should parse the (v1+v2)/2 part for syntax errors, injections (improbable, this functionality is only for trusted administration users that have no interest in damaging their own da开发者_如何学Pythonta).

===============

Add:

here are som DDL for key tables in my database (I trimmed some fields and tables not needed for this question):

CREATE TABLE acquisitions (
    id INTEGER NOT NULL, 
    id_cu INTEGER NOT NULL, 
    datetime DATETIME NOT NULL, 
    PRIMARY KEY (id), 
    FOREIGN KEY(id_cu) REFERENCES ctrl_units (id), 
)

CREATE TABLE data (
    id INTEGER NOT NULL, 
    id_acq INTEGER NOT NULL, 
    id_meas INTEGER NOT NULL, 
    value FLOAT, 
    PRIMARY KEY (id), 
    FOREIGN KEY(id_acq) REFERENCES acquisitions (id)
)

CREATE TABLE sensors (
    id_cu INTEGER NOT NULL, 
    id_meas INTEGER NOT NULL, 
    name VARCHAR(40) NOT NULL, 
    PRIMARY KEY (id_cu, id_meas), 
    FOREIGN KEY(id_cu) REFERENCES ctrl_units (id)
)

CREATE TABLE ctrl_units (
id INTEGER NOT NULL, 
name VARCHAR(40) NOT NULL, 
PRIMARY KEY (id)
)

Brief description: I have ctrl_units sending data over a network in csv like files, every ctrl_unit has many sensors, a single sensor is identified as id_cu-id_meas.

This is not a solution I found but the real way ctrl_units send data over the network.

Every acquisition is a row in csv, with datetime and many sensor measures as id_meas, value (simplified version: to say it all every sensor sends more measures in an acquisition as id_meas, id_elab, value where id_elab has different meanings according to sensor type)

Working this way data of a sensor (a result column) is identified by it's Control Unit id (stored in parent acquisition) and it's Measure Type id (stored in single data). I use this query to extract data.

In order to make it possible to preserve the ordering in the query, and allow the computed columns appear in the right place, you really need something in the view_opts table as a place holder. What about something like:

view_opts
id
id_view
isCalc bit --contains 0 for Sensor and 1 for Calc (not strictly necessary)
id_sensor --nullable
id_viewcalc --nullable
ord #for column ordering

The nullable columns should mean that fk relations can be maintained.

Then an extra table for the calcs (to avoid spamming up the view_opts table):

view_calcs
id_viewcalc
...

You then have the option to make a predefined set of calcs available and reference these, along with the required columns, or to allow a fully configurable set of calcs (I'd advise the first as the second can end up with a horrid UI).

If the users can create Views, why don't you simply GRANT CREATE VIEW TO user_role. They can write their own calculations. Then they will have the full capability of SQL Views. Anything you or I could come up with will be:

a fraction of that, and
require additional tables and maintenance

Implementation

Given that you have to implement it, then.

The best method is still to do as little in your app code, and as much in SQL and SQL facilities, as possible. Parsing the input request, and building the query is difficult enough, and you don't want to have to change that every time they come up with a new request.

What you have is fairly hard-coded and brittle; they appear to be summary tables which do only one thing. Build Vectors in the database, instead. As in, the Dimension-Fact structures that are common practice in data warehouses, as opposed to summary tables. If you post some DDL (3 or 4 key tables being used as source, for the "views" in your question), I can provide more specific directions.

Of course, if your database is not a database, that will prevent basic functions that can be expected from databases and SQL. Half the time, these filing systems just need to be re-implemented, the cost of struggling to provide ordinary functions from non-databases and to maintain them is far more than the re-implementation cost.

The other half of the time, the crippled non-database can be overcome by implementing a few SQL Views, each of which provide the facility of a Normalised table. But that needs careful analysis, not possible here on SO, before the decision as to whether that is possible, can be made. Following that, you can then build your Vectors or Dimensions on the Views, as you would on the tables in a real database.

Each Vector or Dimension is a Scalar Subquery, populating a single column. The outer query defines the structure of the result set (grid). The subqueries populate the cells.

Last, in your app, which can now be a simpler parser, given the input data, it can grab a vector, or two, or a vector divided by another vector, etc. Build the SELECT statement, and then EXECUTE.