-
Notifications
You must be signed in to change notification settings - Fork 1
Expand file tree
/
Copy pathdocs_sql.html
More file actions
319 lines (258 loc) · 17.7 KB
/
docs_sql.html
File metadata and controls
319 lines (258 loc) · 17.7 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
<html><head>
<title>DBToaster - DBToaster SQL Reference</title>
<link rel="stylesheet" type="text/css" href="css/style.css" />
<link rel="stylesheet" type="text/css" href="css/bootstrap.min.css" media="screen" />
<link rel="stylesheet" type="text/css" href="css/bootstrap-theme.min.css" />
<link rel="shortcut icon" href="favicon.ico" type="image/x-icon">
<link rel="icon" href="favicon.ico" type="image/x-icon">
<link rel="stylesheet" href="http://maxcdn.bootstrapcdn.com/font-awesome/4.3.0/css/font-awesome.min.css">
<link rel="stylesheet" href="http://fortawesome.github.io/Font-Awesome/assets/font-awesome/css/font-awesome.css">
</head> <body>
<a name="pagetop"></a>
<div class="overallpage">
<div class="pagebody">
<div class="logobox">
<a href="index.html" ><img src="dbtoaster-logo.gif" width="214"
height="100" alt="DBToaster"/></a> </div>
<div class="navbar navbar-default">
<div class="navbar-header">
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target=".navbar-collapse">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
</div>
<div class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">About <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="index.html" >Welcome to dbtoaster.org</a></li>
<li><a href="home_about.html" >Is DBToaster right for you?</a></li>
<li><a href="home_performance.html" >Performance</a></li>
<li><a href="home_features.html" >Features</a></li>
<li><a href="home_features.html#roadmap" ><small><i class="fa fa-caret-square-o-right"></i> Roadmap</small></a></li>
<li><a href="home_people.html" >The Team</a></li>
<li><a href="home_research.html" >For Researchers</a></li>
</ul>
</li>
<li class="dropdown">
<!--?php if(!isset($now_building_distro)) { ?-->
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Downloads <b class="caret"></b></a>
<!--?php } else { ?-->
<!--a href="#" class="dropdown-toggle" data-toggle="dropdown">License <b class="caret"></b></a-->
<!--?php } ?-->
<ul class="dropdown-menu">
<!--?php if(!isset($now_building_distro)) { ?-->
<li><a href="download.html" >Downloads</a></li>
<!--?php } ?-->
<li><a href="download.html#license" >License</a></li>
<li><a href="download.html#changelog" >Changelog</a></li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Documentation <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="docs.html" >Installation</a></li>
<li><a href="docs_start.html" >Getting Started</a></li>
<li><a href="docs_architecture.html" >Architecture</a></li>
<li><a href="docs_compiler.html" >Command-Line Reference</a></li>
<li><a href="docs_compiler.html#options" ><small><i class="fa fa-caret-square-o-right"></i> Command-Line Options</small></a></li>
<li><a href="docs_compiler.html#languages" ><small><i class="fa fa-caret-square-o-right"></i> Supported Languages</small></a></li>
<li><a href="docs_compiler.html#opt_flags" ><small><i class="fa fa-caret-square-o-right"></i> Optimization Flags</small></a></li>
<li><a href="docs_sql.html" >DBToaster SQL Reference</a></li>
<li><a href="docs_stdlib.html" >DBToaster StdLib Reference</a></li>
<li><a href="docs_adaptors.html" >DBToaster Adaptors Reference</a></li>
<li><a href="docs_cpp.html" >C++ Code Generation</a></li>
<li><a href="docs_scala.html" >Scala Code Generation</a></li>
<li><a href="docs_java.html" >DBToaster in Java Programs</a></li>
<li><a href="docs_customadaptors.html" >Custom Adaptors</a></li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Contact <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="home_contact.html#inquiries" >Inquiries</a></li>
<li><a href="home_contact.html#mailing" >Mailing List</a></li>
<li><a href="bugs.html" >Bug Reports</a></li>
</ul>
</li>
</ul>
</div>
</div>
<div class="contentwrapper">
<div class="content">
<div class="doc_chain_link"> < <span class="doc_chain_link_prev"><a href="docs_compiler.html" >Command-Line Reference</a></span> | <span class="doc_chain_link_next"><a href="docs_stdlib.html" >DBToaster StdLib Reference</a></span> > </div> <div class="titlebox">DBToaster SQL Reference</div>
<a name="createfunction"></a>
<h3>1. CREATE FUNCTION</h3>Declare a forward reference to a user defined function in the target language.
<div class="codeblock">create_function :=
CREATE FUNCTION <name> ( <arguments> ) RETURNS <type> AS <definition>
arguments := [<var_1> <type_1> [, <var_2> <type_2> [, ...]]]
definition := EXTERNAL '<external_name>';
</div>
<p>Use create function to declare a user-defined primitive-valued function in the target language. At this time, DBToaster does not create target-language specific declarations are created, so the function must be in-scope within the generated code. Once declared, a UDF may be used in any arithmetic expression within DBToaster.
For example, the following block illustrates use of the math.h cos() and sin() functions for C++ targetted code.
<div class="codeblock">CREATE FUNCTION cos ( x double ) RETURNS double AS EXTERNAL 'sin';
CREATE FUNCTION sin ( x double ) RETURNS double AS EXTERNAL 'cos';
SELECT r.distance * cos(r.angle) AS x,
r.distance * sin(r.angle) AS y,
FROM RadialMeasurements r;
</div>
<hr/>
<a name="create"></a>
<h3>2. CREATE TABLE/STREAM</h3>Declare a relation for use in the query.
<div class="codeblock">create_statement :=
CREATE { TABLE | STREAM } <name> ( <schema> )
[<source_declaration>]
schema := [<var_1> <type_1> [, <var_2> <type_2> [, ...]]]
source_declaration := source_stream source_adaptor
source_stream :=
FROM FILE '<path>' {
FIXEDWIDTH <bytes_per_row>
| LINE DELIMITED
| '<delim_string>' DELIMITED
}
source_adaptor :=
<adaptor_name> (
[<param_1> := '<value>' [, <param_2> := '<value>' [, ...]]]
)
</div>
<p>A create statement defines a relation named <tt>name</tt> with the indicated schema and declares a method for automatically populating/updating rows of that relation.</p>
<p>Each relation may be declared to be either a Stream or a Table:
<ul>
<li>Tables are static data sources. A table is read in prior to query monitoring, and must remain constant once monitoring has started.</li>
<li>Streams are dynamic data sources. Stream updates are read in one tuple at a time as data becomes available, and query views are updated after every update to a stream.</li>
</ul></p>
<p>The source declaration allows DBToaster (either in the interpreter, or the generated source code) to automatically update the relation. The source declaration is optional when using DBToaster to generate source code. User programs may manially inject updates to relations, or manually declare sources during initialization of the DBToaster-genertaed source code.
</p>
<p>A source declaration consists of stream and adaptor components. The stream component defines where data should be read from, and how records in the data are delimited. At present, DBToaster only supports reading tuples from files.
</p>
<p>If the same file is referenced multiple times, the file will only be scanned once, and events will be generated in the order in which they appear in the file.
</p>
<p>The adaptor declares how to parse fields out of each record. See below for documentation on DBToaster's standard adaptors package.
</p>
<h4>2.1. Example</h4><div class="codeblock">CREATE STREAM R(a int, b date)
FROM FILE 'examples/data/r.dat' LINE DELIMITED
CSV (fields := '|')
</div>
<hr/>
<a name="include"></a>
<h3>3. INCLUDE</h3>Import a secondary SQL file.
<div class="codeblock">include_statement := INCLUDE 'file'
</div>
Import the contents of the selected file into DBToaster. The file path is interpreted relative to the current working directory.
<hr/>
<a name="select"></a>
<h3>4. SELECT</h3>Declare a query to monitor.
<div class="codeblock">select_statement :=
SELECT <target_1> [, <target_2> [, ...]]
FROM <source_1> [, <source_2> [, ...]]
WHERE <condition>
[GROUP BY <group_vars>]
target := <expression> [[AS] <target_name>] | * | *.*
| <source_name>.*
source := <relation_name> [[AS] <source_name>]
| (<select_statement>) [AS] <source_name>
| <source> [NATURAL] JOIN <source> [ON <condition>]
expression := (<expression>) | <int> | <float> | '<string>'
| <var> | <source>.<var>
| <expression> { + | - | * | / } <expression>
| -<expression>
| (SELECT <expression> FROM ...)
| SUM(<expression>) | COUNT(* | <expression>)
| AVG(<expression>) | COUNT(DISTINCT [var1, [var2, [...]]])
| <inline_function>([<expr_1> [, <expr_2> [, ...]]])
| DATE('yyyy-mm-dd')
| EXTRACT({year|month|day} FROM <date>)
| CASE <expression> WHEN <expression> THEN <expression> [, ...]
[ELSE <expression>] END
| CASE WHEN <condition> THEN <expression> [, ...]
[ELSE <expression>] END
condition := (<condition>) | true | false | not (<condition>)
| <expression> { < | <= | > | >= | = | <> } <expression>
| <expression> { < | <= | > | >= | = | <> } { SOME | ALL }
<select_statement>
| <condition> AND <condition> | <condition> OR <condition>
| EXISTS <select_statement>
| <expression> BETWEEN <expression> AND <expression>
| <expression> IN <select_statement>
| <expression> LIKE <matchstring>
</div>
<p>DBToaster SQL's SELECT operation differs from the SQL-92 standard. Full support for the SQL-standard SELECT is planned, and will be part of a future release.</p>
<dl>
<dt><b>Aggregates</b></dt>
<dd>DBToaster currently has support for the SUM, COUNT, COUNT DISTINCT, and AVG aggregates. MIN and MAX are not presently supported. Also, see the note on NULL values below.
<dt><b>Types</b></dt>
<dd>DBToaster presently supports integer, floating point, string, and date types. char and varchar types are treated as strings of unbounded length.</dd>
<dt><b>Conditional Predicates</b></dt>
<dd>DBToaster presently supports boolean expressions over arithmetic comparisons (=, <>, <, <=, >, >=), existential/universal quantification (SOME/ALL/EXISTS), BETWEEN, IN, and LIKE.</dd>
<dt><b>SELECT syntax</b></dt>
<dd>SELECT [FROM] [WHERE] [GROUP BY] queries are supported. The DISTINCT, UNION, LIMIT, ORDER BY, and HAVING clauses are not presently supported. The HAVING operator may be simulated by use of nested queries:
<div class="codeblock">SELECT A, SUM(B) AS sumb FROM R HAVING SUM(C) > 2</div>
is equivalent to
<div class="codeblock">SELECT A, sumb FROM (
SELECT A, SUM(B) AS sumb, SUM(C) as sumc FROM R
)
WHERE sumc > 2</div>
</dd>
<dt><b>NULL values</b></dt>
<dd>DBToaster does not presently support NULL values. The SUM or AVG of an empty table is 0, and not NULL. OUTER JOINS are not supported.</dd>
<dt><b>Incremental Computation with Floating Point Numbers</b></dt>
<dd>There are several subtle issues that arise when performing incremental computations with floating point numbers:
<ul>
<li>When using division in conjunction with aggregates, be aware that SUM and AVG return 0 for empty tables. Once a result value becomes NAN or INFTY, it will no longer be incrementally maintainable. We are working on a long-term fix. In the meantime, there are two workarounds for this problem. For some queries, you can coerce the aggregate value to be nonzero using the LISTMAX standard function (See the example query tpch/query8.sql in the DBToaster distribution for an example of how to do this). For most queries, the -F EXPRESSIVE-TLQs optimization flag will typically materialize the divisor as a separate map (the division will be evaluated when accessing results).</li>
<li>The floating point standards for most target languages (including OCaml, Scala, and C++) do not have well-defined semantics for equality tests over floating point numbers. Consequently, queries with floating-point group-by variables might produce non-unique groups (since two equivalent floating point numbers are not considered to be equal). We are working on a long-term fix. In the meantime, the issue can be addressed by using CAST_INT, or CAST_STRING to convert floating point numbers into canonical forms.</li>
</ul>
<dt><b>Other Notes</b></dt>
<dd>
<ul>
<li>DBToaster does not allow non-aggregate queries to evaluate to singleton values. That is, the query
<div class="codeblock">SELECT 1 FROM R WHERE R.A = (SELECT A FROM S)</div>
is a compile-time error in DBToaster (while such a query would instead produce a run time error if it returned more than one tuple in SQL-92). An equivalent, valid query would be:<br/>
<div class="codeblock">SELECT 1 FROM R WHERE R.A IN (SELECT A FROM S)</div></li>
<li>Variable scoping rules are slightly stricter than the SQL standard (you may need to use fully qualified names in some additional cases).</li>
</ul>
</dd>
</dl>
</p>
<p>See <a href="docs_stdlib.html" >DBToaster StdLib Reference</a> for the documentation on DBToaster's standard function library.</p>
<p>DBToaster maintains query results in the form of either multi-key dictionaries (a.k.a., maps, hashmaps, etc...), or singleton primitive-typed values. Each query result is assigned a name based on the query (see documentation for your target language's code generator for details on how to access the results).<p>
<ul>
<li>Non-aggregate queries produce a dictionary named "COUNT". Each entry in the dictionary has a key formed from the target fields of the SELECT. Values are the number of times the tuple occurs in the output (i.e., the query includes an implicit group-by COUNT(*) aggregate).</li>
<li>Singleton (non-grouping) aggregate queries produce a primitive-typed result for each aggregate target in the SELECT. The result names are assigned based on the name of each target (i.e., using the name following the optional <tt>AS</tt> clause, or a procedurally generated name otherwise).</li>
<li>Group-by aggregate queries produce a dictionary for each aggregate target. The non-aggregate (group-by) targets are used as keys for each entry (as for non-aggregate queries), and the value is the aggregate value for each group. The dictionaries are named based on the name of each aggregate target (as for singleton aggregate queries)</li>
</ul></p>
<p>If multiple SELECT statements occur in the same file, the result names of each query will be prefixed with "QUERY#_", where # is an integer.</p>
<h3>5. EXAMPLES</h3>
<div class="codeblock">CREATE STREAM R(A int, B int);
CREATE STREAM S(B int, C int);</div>
<h4>5.1. Non-aggregate query</h4><div class="codeblock">SELECT * FROM R;</div>
Generates a single dictionary named COUNT, mapping from the tuple "<R.A, R.B>" to the number of time each tuple occurs in R.
<h4>5.2. Aggregate query</h4><div class="codeblock">SELECT SUM(R.A * S.C) AS sum_ac FROM R NATURAL JOIN S;</div>
Generates a single constant integer named SUM_AC containing the query result.
<h4>5.3. Aggregate group-by query (one group-by var)</h4><div class="codeblock">SELECT S.C, SUM(R.A) AS sum_a
FROM R NATURAL JOIN S
GROUP BY S.C;</div>
Generates a dictionary named SUM_A mapping from values of S.C to the sums of R.A.
<h4>5.4. Aggregate group-by query (multiple group-by vars)</h4><div class="codeblock">SELECT R.A, R.B, COUNT(*) AS foo FROM R GROUP BY R.A, R.B;</div>
Generates a single dictionary named FOO, mapping from the tuple "<R.A, R.B>" to the number of time each tuple occurs in R.
<h4>5.5. Query with multiple aggregates</h4><div class="codeblock">SELECT SUM(R.A) AS sum_a, SUM(S.C) AS sum_c
FROM R NATURAL JOIN S
GROUP BY S.C;</div>
Generates two dictionaries named SUM_A and SUM_C, respectively containing the sums of R.A and S.C.
<h4>5.6. Multiple Queries</h4><div class="codeblock">SELECT SUM(R.A) AS SUM_A FROM R;
SELECT SUM(S.C) SUM_C FROM S;</div>
Generates two dictionaries named QUERY_1_SUM_A and QUERY_2_SUM_C, respectively containing the sums of R.A and S.C.
<div class="doc_chain_link"> < <span class="doc_chain_link_prev"><a href="docs_compiler.html" >Command-Line Reference</a></span> | <span class="doc_chain_link_next"><a href="docs_stdlib.html" >DBToaster StdLib Reference</a></span> > </div> </div><!-- /content -->
</div><!-- /contentwrapper -->
</div><!-- /pagebody -->
<div class="footer">
<hr/>
<p>Copyright (c) 2009-2014, The DBToaster Consortium. All rights reserved.</p>
</div>
</div><!-- /overallpage -->
<div id="pageEndElem" style="padding: 0 20px;"></div>
<script type="text/javascript" src="js/jquery-2.0.3.min.js"> </script>
<script type="text/javascript" src="js/bootstrap.min.js"> </script>
</body>
</html>