52 commits
53f7660
Basic compilation against TinkerPop 3.1.0-SNAPSHOT
spmallette Sep 4, 2015
d302e15
Merge remote-tracking branch 'origin/titan09' into titan09-ci-31
spmallette Sep 4, 2015
1053f02
Merge remote-tracking branch 'origin/titan09' into titan09-ci-31
spmallette Sep 4, 2015
a23bfa5
Remove OptOuts for tests that are no longer present in TinkerPop test…
spmallette Sep 10, 2015
ab01284
Deal with shading of jackson dependencies in TinkerPop 3.1.0.
spmallette Sep 10, 2015
fe454df
Merge remote-tracking branch 'origin/titan09' into tp3-ci-31
spmallette Sep 16, 2015
9e43e49
add CompactionStrategy options for Cassandra storage configuration
pluradj Sep 23, 2015
e531278
Merge remote-tracking branch 'origin/titan10' into tp3-ci-31
spmallette Oct 1, 2015
92ce7ef
Get Titan building given change to TP3 around hadoop support.
spmallette Oct 1, 2015
b8c6f36
renamed VendorOptimizationStrategy to ProviderOptimizationStrategy.
okram Oct 12, 2015
c9ebb1c
TitanGraphStep is now a able to be used mid-traversal. TinkerPop3.1 u…
okram Oct 29, 2015
6d2f7e1
GraphComputer.configure() now has a default implementation. Thus, bac…
okram Nov 2, 2015
daf7682
Added fix to be applied to after TINKERPOP3-885 is merged.
spmallette Nov 7, 2015
f676b04
Merge branch 'tp3-ci-31-tx' into tp3-ci-31
spmallette Nov 9, 2015
04c578b
Upgrade tinkerpop 3.1.0-incubating
christianbellinaef Nov 20, 2015
d3006dc
Handle self-loop edges in CassandraInputFormat.
Nov 25, 2015
f26d0de
Make TitanVertexDeserializer (refCounter) a static object.
Nov 26, 2015
bb4f61b
Merge pull request #1192 from englishtown/tp3-ci-31
spmallette Dec 2, 2015
b8d1e60
remove rexster dir and reference in pom.xml
Dec 11, 2015
3e93f5c
version update slf4j
Dec 11, 2015
63173d9
fix nullpointer in TitanGraphStepStrategyTest testcase
Dec 15, 2015
1eacc0a
Fixing TitanIndexTest
dalaro Dec 15, 2015
931cc0d
Clean up some mentions of Blueprints in titanbasics #1165
spmallette Dec 16, 2015
1071067
Merge pull request #1207 from graben1437/removerexster
spmallette Dec 16, 2015
80d5993
Removed as many rexster references as made sense.
spmallette Dec 16, 2015
6d8fdc9
Merge branch 'tp3-ci-31' of github.com:thinkaurelius/titan into tp3-c…
dalaro Dec 17, 2015
63d821b
Switch back to Hadoop 2
dalaro Dec 17, 2015
948a279
Set version to 1.1.0-SNAPSHOT
dalaro Dec 17, 2015
b24439d
Merge branch 'compaction_strategy' of https://github.com/pluradj/tita…
dalaro Dec 17, 2015
a5da58a
Merge branch 'pluradj-compaction_strategy' into titan11
dalaro Dec 17, 2015
d647a69
Apply C* compaction options only when present
dalaro Dec 17, 2015
94a6ff6
Update versions to sate Spark & Gremlin-Server
dalaro Dec 17, 2015
7fa1680
Merge branch 'titan10' into titan11
dalaro Dec 17, 2015
1757146
Add missing statement-terminating semicolon
dalaro Dec 17, 2015
671cbd3
Manually commit test transaction
dalaro Dec 17, 2015
7244ede
Adding test to cover #1195
dalaro Dec 17, 2015
d56fe1c
Merge branch 'titan10' of git://github.com/adeandrade/titan into issu…
dalaro Dec 17, 2015
3e16b47
Merge branch 'issue_1195-adeandrade' into titan11
dalaro Dec 17, 2015
5d2aaa5
Merge branch 'static-connection' of git://github.com/adeandrade/titan…
dalaro Dec 17, 2015
619ea90
Tweaks to #1197
dalaro Dec 17, 2015
ead4a3d
Fix test failure caused by #1197
dalaro Dec 17, 2015
9f4d852
Merge branch 'issue_1197-adeandrade' into titan11
dalaro Dec 17, 2015
e1e8ddd
Merge branch 'update-slf4j-version' of git://github.com/graben1437/ti…
dalaro Dec 17, 2015
e613110
Merge branch 'issue_1208-graben1437-update-slf4j-version' into titan11
dalaro Dec 17, 2015
b3430c3
fixed typo causing build break on titan-dist hadoop2
pluradj Dec 17, 2015
3a220c9
Merge branch 'pluradj-titan11' into titan11
dalaro Dec 21, 2015
b4d8ec1
Updated the included sample init script for Gremlin Server.
spmallette Dec 21, 2015
e9c9c60
Added some docs for connecting to Titan Server with gremlin driver.
spmallette Dec 21, 2015
edd4abe
Added the TitanIoRegistry to the list of the classes imported by the …
spmallette Dec 21, 2015
7f4cfad
Merge branch 'fixtestnullptr1' of git://github.com/graben1437/titan i…
dalaro Dec 22, 2015
adb2c2f
Merge branch 'graben1437-fixtestnullptr1' into titan11
dalaro Dec 22, 2015
2c2cd98
More efficient way to handle self-loops (an edge connecting a vertex …
Dec 27, 2015
4 changes: 2 additions & 2 deletions docs/hbase.txt
@@ -61,8 +61,8 @@ image:titan-modes-rexster.png[]
Finally, Gremlin server can be wrapped around each Titan instance defined in the previous subsection. In this way, the end-user application need not be a Java-based application as it can communicate with Gremlin server as a client. This type of deployment is great for polyglot architectures where various components written in different languages need to reference and compute on the graph.

----
-http://rexster.titan.machine1/mygraph/vertices/1
-http://rexster.titan.machine2/mygraph/tp/gremlin?script=g.v(1).out('follows').out('created')
+http://gremlin-server.titan.machine1/mygraph/vertices/1
+http://gremlin-server.titan.machine2/mygraph/tp/gremlin?script=g.v(1).out('follows').out('created')
----

In this case, each Gremlin server would be configured to connect to the HBase cluster. The following shows the graph specific fragment of the Gremlin server configuration. Refer to <<server>> for a complete example and more information on how to configure the server.
49 changes: 28 additions & 21 deletions docs/titanbasics.txt
@@ -306,11 +306,6 @@ Defining Vertex Labels

Like edges, vertices have labels. Unlike edge labels, vertex labels are optional. Vertex labels are useful to distinguish different types of vertices, e.g. _user_ vertices and _product_ vertices.

-For compatibility with Blueprints, Titan provides differently-named methods for adding labeled and unlabeled vertices:
-
-* `addVertexWithLabel`
-* `addVertex`
-
Although labels are optional at the conceptual and data model level, Titan assigns all vertices a label as an internal implementation detail. Vertices created by the `addVertex` methods use Titan's default label.

To create a label, call `makeVertexLabel(String).make()` on an open graph or management transaction and provide the name of the vertex label as the argument. Vertex label names must be unique in the graph.
@@ -625,6 +620,18 @@ The `:remote` command tells the console to configure a remote connection to Grem
[TIP]
To start Titan Server with the REST API, find the `conf/gremlin-server/gremlin-server.yaml` file in the distribution and edit it. Modify the `channelizer` setting to be `org.apache.tinkerpop.gremlin.server.channel.HttpChannelizer` then start Titan Server.

+One might also connect to Gremlin Server with any of the supported available drivers. TinkerPop ships with a link:http://tinkerpop.apache.org/docs/3.1.0-incubating/#_connecting_via_java[Java-based driver] that can be used for this purpose. When using the Java driver with Titan, it is important to consider the serialization settings that the driver requires. By default, the driver uses TinkerPop's Gryo format and therefore needs some important classes registered with the driver to deserialize results from the server. Usage looks like this:
+
+[source,java]
+----
+GryoMapper mapper = GryoMapper.build().addRegistry(TitanIoRegistry.INSTANCE).create();
+Cluster cluster = Cluster.build().serializer(new GryoMessageSerializerV1d0(mapper)).create();
+Client client = cluster.connect();
+client.submit("g.V()").all().get();
+----
+
+By adding the `TitanIoRegistry` to the `org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0`, the driver will know how to properly deserialize custom data types returned by Titan.
+
[[indexes]]
Indexing for better Performance
-------------------------------
@@ -943,7 +950,7 @@ This section describes Titan's transactional semantics and API.
Transaction Handling
~~~~~~~~~~~~~~~~~~~~

-Every graph operation in Titan occurs within the context of a transaction. According to the Blueprints' specification, each thread opens its own transaction against the graph database with the first operation (i.e. retrieval or mutation) on the graph::
+Every graph operation in Titan occurs within the context of a transaction. According to TinkerPop's transactional specification, each thread opens its own transaction against the graph database with the first operation (i.e. retrieval or mutation) on the graph:

[source, gremlin]
----
@@ -958,15 +965,15 @@ In this example, a local Titan graph database is opened. Adding the vertex "juno
Transactional Scope
~~~~~~~~~~~~~~~~~~~

-All graph elements (vertices, edges, and types) are associated with the transactional scope in which they were retrieved or created. Under Blueprint's default transactional semantics, transactions are automatically created with the first operation on the graph and closed explicitly using `commit()` or `rollback()`. Once the transaction is closed, all graph elements associated with that transaction become stale and unavailable. However, Titan will automatically transition vertices and types into the new transactional scope as shown in this example::
+All graph elements (vertices, edges, and types) are associated with the transactional scope in which they were retrieved or created. Under TinkerPop's default transactional semantics, transactions are automatically created with the first operation on the graph and closed explicitly using `commit()` or `rollback()`. Once the transaction is closed, all graph elements associated with that transaction become stale and unavailable. However, Titan will automatically transition vertices and types into the new transactional scope as shown in this example:

[source, gremlin]
graph = TitanFactory.open("berkeleyje:/tmp/titan")
juno = graph.addVertex() //Automatically opens a new transaction
graph.tx().commit() //Ends transaction
juno.property("name", "juno") //Vertex is automatically transitioned

-Edges, on the other hand, are not automatically transitioned and cannot be accessed outside their original transaction. They must be explicitly transitioned.
+Edges, on the other hand, are not automatically transitioned and cannot be accessed outside their original transaction. They must be explicitly transitioned:

[source, gremlin]
e = juno.addEdge("knows", graph.addVertex())
@@ -977,7 +984,7 @@ e.property("time", 99)
Transaction Failures
~~~~~~~~~~~~~~~~~~~~

-When committing a transaction, Titan will attempt to persist all changes to the storage backend. This might not always be successful due to IO exceptions, network errors, machine crashes or resource unavailability. Hence, transactions can fail. In fact, transactions *will eventually fail* in sufficiently large systems. Therefore, we highly recommend that your code expects and accommodates such failures.
+When committing a transaction, Titan will attempt to persist all changes to the storage backend. This might not always be successful due to IO exceptions, network errors, machine crashes or resource unavailability. Hence, transactions can fail. In fact, transactions *will eventually fail* in sufficiently large systems. Therefore, we highly recommend that your code expects and accommodates such failures:

[source, gremlin]
try {
@@ -995,7 +1002,7 @@ The example above demonstrates a simplified user signup implementation where `na

If the transaction fails, a `TitanException` is thrown. There are a variety of reasons why a transaction may fail. Titan differentiates between _potentially temporary_ and _permanent_ failures.

-Potentially temporary failures are those related to resource unavailability and IO hickups (e.g. network timeouts). Titan automatically tries to recover from temporary failures by retrying to persist the transactional state after some delay. The number of retry attempts and the retry delay are configurable (see <<titan-config-ref>>).
+Potentially temporary failures are those related to resource unavailability and IO hiccups (e.g. network timeouts). Titan automatically tries to recover from temporary failures by retrying to persist the transactional state after some delay. The number of retry attempts and the retry delay are configurable (see <<titan-config-ref>>).

Permanent failures can be caused by complete connection loss, hardware failure or lock contention. To understand the cause of lock contention, consider the signup example above and suppose a user tries to signup with username "juno". That username may still be available at the beginning of the transaction but by the time the transaction is committed, another user might have concurrently registered with "juno" as well and that transaction holds the lock on the username therefore causing the other transaction to fail. Depending on the transaction semantics one can recover from a lock contention failure by re-running the entire transaction.
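
The recovery strategy described above, re-running the entire transaction after a lock contention failure, can be sketched in plain Java. This is only an illustration under stated assumptions: `LockContentionException` and `runWithRetry` are hypothetical names introduced here and are not part of Titan or TinkerPop. In real code the unit of work would open the transaction, apply the mutations, commit, and roll back on failure before the next attempt.

```java
import java.util.concurrent.Callable;

public class RetryOnContention {
    /** Hypothetical stand-in for a permanent locking failure; not a Titan class. */
    static class LockContentionException extends RuntimeException {
        LockContentionException(String message) { super(message); }
    }

    /** Re-runs the whole unit of work, up to maxAttempts times, when it fails on lock contention. */
    static <T> T runWithRetry(Callable<T> work, int maxAttempts) throws Exception {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        LockContentionException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.call();   // the entire transaction body, including the commit
            } catch (LockContentionException e) {
                last = e;             // another transaction held the lock; try again from the start
            }
        }
        throw last;                   // every attempt failed on contention
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String result = runWithRetry(() -> {
            calls[0]++;
            // Simulate two failed commits before the signup succeeds.
            if (calls[0] < 3) throw new LockContentionException("username lock held");
            return "signed up";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts"); // signed up after 3 attempts
    }
}
```

Whether a retry is safe depends on the transaction semantics: the work must be written so that running it again from the beginning produces the intended result.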

@@ -1008,30 +1015,30 @@ Permanent exceptions that can fail a transaction include:
Multi-Threaded Transactions
~~~~~~~~~~~~~~~~~~~~~~~~~~~

-Titan supports multi-threaded transactions through Blueprint's http://tinkerpop.incubator.apache.org/docs/{tinkerpop_version}/#transactions[ThreadedTransactionalGraph] interface. Hence, to speed up transaction processing and utilize multi-core architectures multiple threads can run concurrently in a single transaction.
+Titan supports multi-threaded transactions through TinkerPop's http://tinkerpop.incubator.apache.org/docs/{tinkerpop_version}/#_threaded_transactions[threaded transactions]. Hence, to speed up transaction processing and utilize multi-core architectures multiple threads can run concurrently in a single transaction.

-With Blueprints' default transaction handling, each thread automatically opens its own transaction against the graph database. To open a thread-independent transaction, use the `newTransaction()` method.
+With TinkerPop's default transaction handling, each thread automatically opens its own transaction against the graph database. To open a thread-independent transaction, use the `newThreadedTx()` method.

[source, gremlin]
-tx = graph.newTransaction();
+threadedGraph = graph.tx().newThreadedTx();
threads = new Thread[10];
for (int i=0; i<threads.length; i++) {
threads[i]=new Thread({
-println("Do something");
+println("Do something with 'threadedGraph'");
});
threads[i].start();
}
for (int i=0; i<threads.length; i++) threads[i].join();
-tx.commit();
+threadedGraph.tx().commit();

-The `newTransaction()` method returns a new `TransactionalGraph` object that represents this newly opened transaction. The graph object `tx` supports all of the methods that the original graph did, but does so without opening new transactions for each thread. This allows us to start multiple threads which all work concurrently in the same transaction and one of which finally commits the transaction when all threads have completed their work.
+The `newThreadedTx()` method returns a new `Graph` object that represents this newly opened transaction. The graph object `threadedGraph` supports all of the methods that the original graph did, but does so without opening new transactions for each thread. This allows us to start multiple threads which all work concurrently in the same transaction and one of which finally commits the transaction when all threads have completed their work.

Titan relies on optimized concurrent data structures to support hundreds of concurrent threads running efficiently in a single transaction.
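
The pattern described above (several threads working inside one transaction, followed by a single commit once every thread has finished) can be sketched without any graph library. This is a simulation under stated assumptions: `SharedTx` is a hypothetical stand-in for the object returned by `newThreadedTx()` and is not a Titan or TinkerPop class; it only shows the start, join, then commit-once structure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class SharedTransactionSketch {
    /** Hypothetical stand-in for a thread-independent transaction; not a Titan class. */
    static class SharedTx {
        private final Queue<String> pending = new ConcurrentLinkedQueue<>();
        private final List<String> committed = new ArrayList<>();

        /** Records a mutation; safe to call from any worker thread. */
        void addVertex(String name) { pending.add(name); }

        /** Called once, by a single thread, after all workers have finished. */
        int commit() {
            committed.addAll(pending);
            pending.clear();
            return committed.size();
        }
    }

    static int runWorkers(int workerCount) throws InterruptedException {
        SharedTx tx = new SharedTx();
        Thread[] threads = new Thread[workerCount];
        for (int i = 0; i < threads.length; i++) {
            final int id = i;
            threads[i] = new Thread(() -> tx.addVertex("v" + id)); // all workers share one tx
            threads[i].start();
        }
        for (Thread t : threads) t.join(); // wait for every worker before committing
        return tx.commit();                // exactly one commit for the shared transaction
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runWorkers(10) + " mutations committed in one transaction");
        // prints "10 mutations committed in one transaction"
    }
}
```

Joining all workers before the commit matters: it guarantees that every mutation made by a worker is visible to the committing thread.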

Concurrent Algorithms
~~~~~~~~~~~~~~~~~~~~~

-Thread independent transactions started through `newTransaction()` are particularly useful when implementing concurrent graph algorithms. Most traversal or message-passing (ego-centric) like graph algorithms are http://en.wikipedia.org/wiki/Embarrassingly_parallel[embarrassingly parallel] which means they can be parallelized and executed through multiple threads with little effort. Each of these threads can operate on a single `TransactionalGraph` object returned by `newTransaction` without blocking each other.
+Thread independent transactions started through `newThreadedTx()` are particularly useful when implementing concurrent graph algorithms. Most traversal or message-passing (ego-centric) like graph algorithms are http://en.wikipedia.org/wiki/Embarrassingly_parallel[embarrassingly parallel] which means they can be parallelized and executed through multiple threads with little effort. Each of these threads can operate on a single `Graph` object returned by `newThreadedTx()` without blocking each other.

Nested Transactions
~~~~~~~~~~~~~~~~~~~
@@ -1054,7 +1061,7 @@ One way around this is to create the vertex in a short, nested thread-independen
[source, gremlin]
v1 = graph.addVertex()
//Do many other things
-tx = graph.newTransaction()
+tx = graph.tx().newThreadedTx()
v2 = tx.addVertex()
v2.property("uniqueName", "foo")
tx.commit() // Any lock contention will be detected here
@@ -1068,7 +1075,7 @@ Common Transaction Handling Problems

Transactions are started automatically with the first operation executed against the graph. One does NOT have to start a transaction manually. The method `newTransaction` is used to start <<multi-thread-tx, multi-threaded transactions>> only.

-Transactions are automatically started under the Blueprints semantics but *not* automatically terminated. Transactions have to be terminated manually with `g.commit()` if successful or `g.rollback()` if not. Manual termination of transactions is necessary because only the user knows the transactional boundary.
+Transactions are automatically started under the TinkerPop semantics but *not* automatically terminated. Transactions have to be terminated manually with `g.commit()` if successful or `g.rollback()` if not. Manual termination of transactions is necessary because only the user knows the transactional boundary.
A transaction will attempt to maintain its state from the beginning of the transaction. This might lead to unexpected behavior in multi-threaded applications as illustrated in the following artificial example::

[source, gremlin]
@@ -1174,7 +1181,7 @@ The configuration option `cache.db-cache-size` controls how much heap space Tita

The cache size can be configured as a percentage (expressed as a decimal between 0 and 1) of the total heap space available to the JVM running Titan or as an absolute number of bytes.
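
To make the two interpretations concrete, the following sketch resolves a configured value to a byte count. It is an illustration only: `resolveCacheBytes` is a hypothetical helper, not Titan's implementation, and the rule that any value below 1 is treated as a fraction of the heap while anything else is treated as bytes is an assumption made for this example.

```java
public class CacheSizeSketch {
    /**
     * Hypothetical helper, not Titan code. Assumes values below 1 are a fraction
     * of the maximum heap and all other values are an absolute number of bytes.
     */
    static long resolveCacheBytes(double configured, long maxHeapBytes) {
        if (configured < 0) throw new IllegalArgumentException("cache size must not be negative");
        if (configured < 1.0) {
            return (long) (configured * maxHeapBytes); // e.g. 0.25 means 25% of the heap
        }
        return (long) configured;                      // already an absolute byte count
    }

    public static void main(String[] args) {
        long heap = Runtime.getRuntime().maxMemory();
        System.out.println("heap bytes:        " + heap);
        System.out.println("25% cache (bytes): " + resolveCacheBytes(0.25, heap));
        System.out.println("absolute (bytes):  " + resolveCacheBytes(512L * 1024 * 1024, heap));
    }
}
```

For a JVM with a 4,000,000,000 byte heap, a setting of 0.25 corresponds to 1,000,000,000 bytes reserved for the cache alone; the rest must cover open transactions and everything else running in the same JVM.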

-Note, that the cache size refers to the amount of heap space that is exclusively occupied by the cache. Titan's other data structures and each open transaction will occupy additional heap space. If additional software layers are running in the same JVM, those may occupy a significant amount of heap space as well (e.g. Rexster, embedded Cassandra, etc). Be conservative in your heap memory estimation. Configuring a cache that is too large can lead to out-of-memory exceptions and excessive GC.
+Note, that the cache size refers to the amount of heap space that is exclusively occupied by the cache. Titan's other data structures and each open transaction will occupy additional heap space. If additional software layers are running in the same JVM, those may occupy a significant amount of heap space as well (e.g. Gremlin Server, embedded Cassandra, etc). Be conservative in your heap memory estimation. Configuring a cache that is too large can lead to out-of-memory exceptions and excessive GC.

Clean Up Wait Time
^^^^^^^^^^^^^^^^^^
@@ -1359,7 +1366,7 @@ When launching Titan with embedded Cassandra, the following warnings may be disp

`958 [MutationStage:25] WARN org.apache.cassandra.db.Memtable - MemoryMeter uninitialized (jamm not specified as java agent); assuming liveRatio of 10.0. Usually this means cassandra-env.sh disabled jamm because you are using a buggy JRE; upgrade to the Sun JRE instead`

-Cassandra uses a Java agent called `MemoryMeter` which allows it to measure the actual memory use of an object, including JVM overhead. To use https://github.com/jbellis/jamm[JAMM] (Java Agent for Memory Measurements), the path to the JAMM jar must be specific in the Java javaagent parameter when launching the JVM (e.g. `-javaagent:path/to/jamm.jar`) through either titan.sh, gremlin.sh, or Rexster:
+Cassandra uses a Java agent called `MemoryMeter` which allows it to measure the actual memory use of an object, including JVM overhead. To use https://github.com/jbellis/jamm[JAMM] (Java Agent for Memory Measurements), the path to the JAMM jar must be specified in the Java javaagent parameter when launching the JVM (e.g. `-javaagent:path/to/jamm.jar`) through either titan.sh, gremlin.sh, or Gremlin Server:

[source, bash]
export TITAN_JAVA_OPTS=-javaagent:$TITAN_HOME/lib/jamm-$MAVEN{jamm.version}.jar
22 changes: 15 additions & 7 deletions pom.xml
@@ -7,7 +7,7 @@
</parent>
<groupId>com.thinkaurelius.titan</groupId>
<artifactId>titan</artifactId>
-<version>1.0.1-SNAPSHOT</version>
+<version>1.1.0-SNAPSHOT</version>
<packaging>pom</packaging>
<prerequisites>
<maven>2.2.1</maven>
@@ -59,15 +59,15 @@
</scm>
<properties>
<titan.compatible.versions />
-<tinkerpop.version>3.0.2-incubating</tinkerpop.version>
+<tinkerpop.version>3.1.0-incubating</tinkerpop.version>
<junit.version>4.12</junit.version>
<mrunit.version>1.1.0</mrunit.version>
<cassandra.version>2.1.9</cassandra.version>
<jamm.version>0.3.0</jamm.version>
<metrics2.version>2.1.2</metrics2.version>
<metrics3.version>3.0.1</metrics3.version>
<sesame.version>2.7.10</sesame.version>
-<slf4j.version>1.7.5</slf4j.version>
+<slf4j.version>1.7.12</slf4j.version>
<httpcomponents.version>4.4.1</httpcomponents.version>
<hadoop1.version>1.2.1</hadoop1.version>
<hadoop2.version>2.7.1</hadoop2.version>
@@ -79,7 +79,7 @@
<hbase100.core.version>1.0.2</hbase100.core.version>
<hbase100.version>${hbase100.core.version}</hbase100.version>
<jackson1.version>1.9.2</jackson1.version>
-<jackson2.version>2.3.0</jackson2.version>
+<jackson2.version>2.4.4</jackson2.version>
<!-- ES depends on Lucene. This ES dependency can affect the
version used by the titan-lucene module. When updating
the ES version, also consider the version of Lucene, and
@@ -94,7 +94,7 @@
<asm3.version>3.1</asm3.version>
<asm4.version>4.0</asm4.version>
<zookeeper.version>3.4.6</zookeeper.version>
-<jersey.version>1.18.2</jersey.version>
+<jersey.version>1.9</jersey.version>
<jna.version>4.0.0</jna.version>
<kuali.s3.wagon.version>1.1.20</kuali.s3.wagon.version>
<jasper.version>5.5.23</jasper.version>
@@ -123,8 +123,6 @@
<module>titan-hbase-parent</module>
<module>titan-es</module>
<module>titan-lucene</module>
-<!-- TODO gremlin-server integration -->
-<!-- <module>titan-rexster</module> -->
<module>titan-all</module>
<module>titan-dist</module>
<module>titan-doc</module>
@@ -845,11 +843,21 @@
<artifactId>jersey-json</artifactId>
<version>${jersey.version}</version>
</dependency>
<dependency>
<groupId>com.sun.jersey</groupId>
<artifactId>jersey-client</artifactId>
<version>${jersey.version}</version>
</dependency>
<dependency>
<groupId>com.sun.jersey</groupId>
<artifactId>jersey-server</artifactId>
<version>${jersey.version}</version>
</dependency>
<dependency>
<groupId>com.sun.jersey</groupId>
<artifactId>jersey-guice</artifactId>
<version>${jersey.version}</version>
</dependency>
<dependency>
<groupId>com.sun.jersey</groupId>
<artifactId>jersey-core</artifactId>