-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathdocs_scala.html
414 lines (349 loc) · 15.5 KB
/
docs_scala.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
<html><head>
<title>DBToaster - Scala Code Generation</title>
<link rel="stylesheet" type="text/css" href="css/style.css" />
<link rel="stylesheet" type="text/css" href="css/bootstrap.min.css" media="screen" />
<link rel="stylesheet" type="text/css" href="css/bootstrap-theme.min.css" />
<link rel="shortcut icon" href="favicon.ico" type="image/x-icon">
<link rel="icon" href="favicon.ico" type="image/x-icon">
<link rel="stylesheet" href="http://maxcdn.bootstrapcdn.com/font-awesome/4.3.0/css/font-awesome.min.css">
<link rel="stylesheet" href="http://fortawesome.github.io/Font-Awesome/assets/font-awesome/css/font-awesome.css">
</head> <body>
<a name="pagetop"></a>
<div class="overallpage">
<div class="pagebody">
<div class="logobox">
<a href="index.html" ><img src="dbtoaster-logo.gif" width="214"
height="100" alt="DBToaster"/></a> </div>
<div class="navbar navbar-default">
<div class="navbar-header">
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target=".navbar-collapse">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
</div>
<div class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">About <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="index.html" >Welcome to dbtoaster.org</a></li>
<li><a href="home_about.html" >Is DBToaster right for you?</a></li>
<li><a href="home_performance.html" >Performance</a></li>
<li><a href="home_features.html" >Features</a></li>
<li><a href="home_features.html#roadmap" ><small><i class="fa fa-caret-square-o-right"></i> Roadmap</small></a></li>
<li><a href="home_people.html" >The Team</a></li>
<li><a href="home_research.html" >For Researchers</a></li>
</ul>
</li>
<li class="dropdown">
<!--?php if(!isset($now_building_distro)) { ?-->
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Downloads <b class="caret"></b></a>
<!--?php } else { ?-->
<!--a href="#" class="dropdown-toggle" data-toggle="dropdown">License <b class="caret"></b></a-->
<!--?php } ?-->
<ul class="dropdown-menu">
<!--?php if(!isset($now_building_distro)) { ?-->
<li><a href="download.html" >Downloads</a></li>
<!--?php } ?-->
<li><a href="download.html#license" >License</a></li>
<li><a href="download.html#changelog" >Changelog</a></li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Documentation <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="docs.html" >Installation</a></li>
<li><a href="docs_start.html" >Getting Started</a></li>
<li><a href="docs_architecture.html" >Architecture</a></li>
<li><a href="docs_compiler.html" >Command-Line Reference</a></li>
<li><a href="docs_compiler.html#options" ><small><i class="fa fa-caret-square-o-right"></i> Command-Line Options</small></a></li>
<li><a href="docs_compiler.html#languages" ><small><i class="fa fa-caret-square-o-right"></i> Supported Languages</small></a></li>
<li><a href="docs_compiler.html#opt_flags" ><small><i class="fa fa-caret-square-o-right"></i> Optimization Flags</small></a></li>
<li><a href="docs_sql.html" >DBToaster SQL Reference</a></li>
<li><a href="docs_stdlib.html" >DBToaster StdLib Reference</a></li>
<li><a href="docs_adaptors.html" >DBToaster Adaptors Reference</a></li>
<li><a href="docs_cpp.html" >C++ Code Generation</a></li>
<li><a href="docs_scala.html" >Scala Code Generation</a></li>
<li><a href="docs_java.html" >DBToaster in Java Programs</a></li>
<li><a href="docs_customadaptors.html" >Custom Adaptors</a></li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Contact <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="home_contact.html#inquiries" >Inquiries</a></li>
<li><a href="home_contact.html#mailing" >Mailing List</a></li>
<li><a href="bugs.html" >Bug Reports</a></li>
</ul>
</li>
</ul>
</div>
</div>
<div class="contentwrapper">
<div class="content">
<div class="doc_chain_link"> < <span class="doc_chain_link_prev"><a href="docs_cpp.html" >C++ Code Generation</a></span> | <span class="doc_chain_link_next"><a href="docs_java.html" >DBToaster in Java Programs</a></span> > </div> <div class="titlebox">Scala Code Generation</div><div class="warning">Warning: This API is subject to changes in future releases.</div>
<p>
<i>Note:</i> To compile and run queries using the Scala backend requires the Scala compiler. Please refer to <a href="docs.html" >Installation</a> for more details.
</p>
<a name="quickstart"></a>
<h3>1. Compiling and running a query</h3><p>
DBToaster generates a JAR file for a query when using the <tt>-l scala</tt> and <tt>-c <file></tt> switches:
</p>
<div class="codeblock">
$> cat examples/queries/simple/rst.sql
CREATE STREAM R(A int, B int)
FROM FILE 'examples/data/simple/r.dat' LINE DELIMITED csv;
CREATE STREAM S(B int, C int)
FROM FILE 'examples/data/simple/s.dat' LINE DELIMITED csv;
CREATE STREAM T(C int, D int)
FROM FILE 'examples/data/simple/t.dat' LINE DELIMITED csv;
SELECT sum(A*D) AS AtimesD FROM R,S,T WHERE R.B=S.B AND S.C=T.C;
$> bin/dbtoaster -c test.jar -l scala examples/queries/simple/rst.sql
</div>
<p>
The command above compiles the query to <tt>test.jar</tt>, which can be run as follows:
</p>
<div class="codeblock">
$> java -classpath "test.jar:lib/dbt_scala/*" ddbt.gen.Dbtoaster
Java 1.7.0_45, Scala 2.10.3
Time: 0.008s (30/0)
ATIMESD:
306
</div>
<p>
After processing all insertions and deletions, the final result is printed.
</p>
<p>
<i>Note for Windows users:</i> When running compiled Scala programs under Cygwin or Windows directly, one should use Windows-style classpath separators (i.e., semicolons). For instance:
</p>
<div class="codeblock">
$> java -cp ".\test.jar;.\lib\dbt_scala\akka-actor_2.10-2.2.3.jar;.\lib\dbt_scala\dbtoaster_2.10-2.1-lms.jar;.\lib\dbt_scala\scala-library-2.10.2.jar;.\lib\dbt_scala\config-1.0.2.jar" ddbt.gen.Dbtoaster
Java 1.7.0_65, Scala 2.10.2
Time: 0.002s (30/0)
ATIMESD:
306
</div>
<a name="apiguide"/></a>
<h3>2. Scala API Guide</h3><p>
In the previous example, we used the standard main function to test the query.
However, to use the query in real applications, it has to be run from within an application.
</p>
<p>
The following listing shows a simple example application that communicates with the query class.
The communication between the application and the query class is handled using <a href="http://akka.io/">akka</a>.
</p>
<div class="codeblock">
package org.dbtoaster
import ddbt.gen._
import ddbt.lib.Messages._
import akka.actor._
object ExampleApp {
val DEFAULT_TIMEOUT = akka.util.Timeout(1L << 42)
def main(args: Array[String]) {
val system = ActorSystem("mySystem")
val q = system.actorOf(Props[Dbtoaster], "Query")
var result:(StreamStat,List[Any]) = null
// Send events
println("Insert a tuple into R.")
q ! TupleEvent(TupleInsert, "R", List(5L, 2L))
println("Insert a tuple into S.")
q ! TupleEvent(TupleInsert, "S", List(2L, 3L))
println("Insert a tuple into T.")
q ! TupleEvent(TupleInsert, "T", List(3L, 4L))
result = scala.concurrent.Await.result(akka.pattern.ask(q, GetSnapshot(List(1)))(DEFAULT_TIMEOUT), DEFAULT_TIMEOUT.duration).asInstanceOf[(StreamStat,List[Any])]
println("Result after this step: " + result._2(0))
println("Insert another tuple into T.");
q ! TupleEvent(TupleInsert, "T", List(3L, 5L))
// Retrieve result
result = scala.concurrent.Await.result(akka.pattern.ask(q, EndOfStream)(DEFAULT_TIMEOUT), DEFAULT_TIMEOUT.duration).asInstanceOf[(StreamStat,List[Any])]
println("Final Result: " + result._2(0))
system.shutdown
}
}
</div>
<p>
This example first creates an ActorSystem and then launches the query actor.
The events are sent to the query actor using <tt>TupleEvent</tt> messages with the following structure:
</p>
<table class="table">
<tr>
<th>Argument</th>
<th>Comment</th>
</tr>
<!--tr>
<td class="code">ord : Int</td>
<td>Order number of the event.</td>
</tr-->
<tr>
<td class="code">op : TupleOp</td>
<td><tt>TupleInsert</tt> for insertion, <tt>TupleDelete</tt> for deletion.</td>
</tr>
<tr>
<td class="code">stream : String</td>
<td>Name of the stream as it appears in the SQL file.</td>
</tr>
<tr>
<td class="code">data : List[Any]</td>
<td>The values of the tuple being inserted into/deleted from the stream.</td>
</tr>
</table>
<p>
To retrieve the final result, an <tt>EndOfStream</tt> message is sent to the query actor.
Alternatively the intermediate result of a query can be retrieved using a <tt>GetSnapshot</tt> message with the following structure:
</p>
<table class="table">
<tr>
<th>Argument</th>
<th>Comment</th>
</tr>
<tr>
<td class="code">view : List[Int]</td>
<td>List of maps that a snapshot is taken of.</td>
</tr>
</table>
<p>
Assuming that the example code has been saved as <tt>example.scala</tt>, it can be compiled with:
</p>
<div class="codeblock">
$> scalac -classpath "rst.jar:lib/dbt_scala/*" -d example.jar example.scala
</div>
<p>
It can then be launched with the following command:
</p>
<div class="codeblock">
$> java -classpath "rst.jar:lib/dbt_scala/*:example.jar" org.dbtoaster.ExampleApp
Insert a tuple into R.
Insert a tuple into S.
Insert a tuple into T.
Result after this step: 20
Insert another tuple into T.
Final Result: 45
</div>
<a name="generatedcode"/></a>
<h3>3. Generated Code Reference</h3><p>
The Scala code generator generates a single file containing an object, a base class containing trigger functions, and an actor class for the query. The base name is derived from the query name unless the <tt>-n <name></tt> flag is used.
</p>
<p>
The code generated for the previous example looks as follows:
</p>
<div class="codeblock">
package ddbt.gen
import ddbt.lib._
...
// Query object used for standalone binaries
object Rst {
import Helper._
def execute(args:Array[String],f:List[Any]=>Unit) = ...
def main(args:Array[String]) {
execute(args,(res:List[Any])=>{
println("ATIMESD:\n"+M3Map.toStr(res(0))+"\n")
})
}
}
// Base class
class RstBase {
import Rst._
import ddbt.lib.Functions._
// Maps/singletons that hold intermediate results
var ATIMESD = 0L
val ATIMESD_mT1 = M3Map.make[Long,Long]();
...
// Triggers
def onAddR(r_a:Long, r_b:Long) {
ATIMESD += (ATIMESD_mR1.get(r_b) * r_a);
ATIMESD_mT1_mR1.slice(0,r_b).foreach { (k1,v1) =>
val atimesd_mtt_c = k1._2;
ATIMESD_mT1.add(atimesd_mtt_c,(v1 * r_a));
}
ATIMESD_mS1.add(r_b,r_a);
}
def onDelR(r_a:Long, r_b:Long) { ... }
...
def onDelT(t_c:Long, t_d:Long) { ... }
def onSystemReady() { }
}
// Query actor
class Rst extends RstBase with Actor {
import Rst._
import ddbt.lib.Messages._
def receive = {
case TupleEvent(TupleInsert,"R",List(v0:Long,v1:Long)) => { ... } onAddR(v0,v1)
...
case StreamInit(timeout) => onSystemReady(); { ... }
case EndOfStream | GetSnapshot(_) => { ... } sender ! (StreamStat(t1-t0,tN,tS),List(ATIMESD))
}
}
</div>
<h4>3.1. The query object</h4><p>
The query object contains the code used by the standalone binary to execute the query.
Its <tt>execute</tt> method reads from the input streams specified in the query file and sends them to the query actor.
The <tt>main</tt> method calls execute and prints the result when all tuples have been processed.
</p>
<h4>3.2. The query base class</h4><p>
The actual query processor lives in the query base class.
For every stream <tt>R</tt>, there is an insertion <tt>onAddR</tt> and a deletion trigger <tt>onDelR</tt>.
These trigger methods are responsible for updating the intermediate result.
The map and singleton data structures at the top of the base class hold the intermediate result.
</p>
<p>
The <tt>onSystemReady</tt> trigger is responsible of loading static information (<tt>CREATE TABLE</tt> statements in the query file) before the actual processing begins.
</p>
<h4>3.3. The query actor</h4><p>
The query actor class dispatches events like tuple insertions and deletions to the base class using the <tt>receive</tt> method.
</p>
<p>
The <tt>EndOfStream</tt> message is sent from the event source when it is exhausted. The query actor replies to this message with the current processing statistics (processing time, number of tuples processed, number of tuples skipped) and one or multiple query results.
</p>
<p>
The <tt>GetSnapshot</tt> message can be used by an application to access the intermediate result. The query actor replies to this message with the current processing statistics and the results that the message asks for.
</p>
<p>
The whole process is guarded by a timeout. If the timeout is reached, the actor will stop to process tuples.
</p>
<!--?=section("Partial materialization")?>
<p>
Some of the work involved in maintaining the results of a query can be saved by performing partial materialization and only materialize the result when requested (i.e. when all tuples have been processed).
This behaviour is especially desirable when the rate of querying the results is lower than the rate of updates, and can be enabled through the <tt>-F EXPRESSIVE-TLQS</tt> command line flag.
</p>
<p>
Below is an example of a query where partial materialization is indeed beneficial (this query can be found as <tt>examples/queries/simple/r_lift_of_count.sql</tt> in the DBToaster download).
</p>
<div class="codeblock">
CREATE STREAM R(A int, B int)
FROM FILE 'examples/data/tiny/r.dat' LINE DELIMITED
csv ();
SELECT r2.C FROM (
SELECT r1.A, COUNT(*) AS C FROM R r1 GROUP BY r1.A
) r2;
</div>
<p>
When compiling this query with <tt>-F EXPRESSIVE-TLQS</tt>, the generated code now has functions representing top-level results:
</p>
<div class="codeblock">
def COUNT() = {
val mCOUNT = M3Map.make[Long,Long]()
val agg1 = M3Map.temp[Long,Long]()
COUNT_1_E1_1.foreach { (r2_a,v1) =>
val l1 = COUNT_1_E1_1.get(r2_a);
agg1.add(l1,(if (v1 != 0) 1L else 0L));
}
agg1.foreach { (r2_c,v2) =>
mCOUNT.add(r2_c,v2)
}
mCOUNT
}
</div-->
<div class="doc_chain_link"> < <span class="doc_chain_link_prev"><a href="docs_cpp.html" >C++ Code Generation</a></span> | <span class="doc_chain_link_next"><a href="docs_java.html" >DBToaster in Java Programs</a></span> > </div> </div><!-- /content -->
</div><!-- /contentwrapper -->
</div><!-- /pagebody -->
<div class="footer">
<hr/>
<p>Copyright (c) 2009-2014, The DBToaster Consortium. All rights reserved.</p>
</div>
</div><!-- /overallpage -->
<div id="pageEndElem" style="padding: 0 20px;"></div>
<script type="text/javascript" src="js/jquery-2.0.3.min.js"> </script>
<script type="text/javascript" src="js/bootstrap.min.js"> </script>
</body>
</html>