Just found an issue in DataStax 4.0 today: if we delete a document by ID from Solr, DataStax clears all of the row's fields in Cassandra but leaves its ID field behind.
For example, we have ID:1 in Solr and in the Cassandra table, then run this request:
http://localhost:8983/solr/ckeyspace.tablename/update?commit=true&stream.body=<delete><id>1</id></delete>
DataStax deletes the document from Solr, but in Cassandra it only clears the fields: the row is still there, with every field at its default value (null or false).
But if we deleteByQuery in Solr with <delete><query>id:1</query></delete>, DataStax deletes the whole row from Cassandra.
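To make the two requests easy to compare, here is a minimal sketch that builds both update URLs. The class and method names are made up for illustration, and the XML bodies assume the standard Solr XML update messages (<delete><id>...</id></delete> and <delete><query>...</query></delete>); the host and core name are the ones from the example above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper: builds the two Solr update URLs discussed above. */
final class SolrDeleteUrls {
    // Core URL from the example above; adjust host and core name for your setup.
    static final String UPDATE_URL = "http://localhost:8983/solr/ckeyspace.tablename/update";

    /** URL that deletes one document by its unique key. */
    static String deleteById(String id) {
        return withBody("<delete><id>" + escapeXml(id) + "</id></delete>");
    }

    /** URL that deletes every document matching a Solr query, e.g. "id:1". */
    static String deleteByQuery(String query) {
        return withBody("<delete><query>" + escapeXml(query) + "</query></delete>");
    }

    // Appends the XML message as a URL-encoded stream.body parameter.
    private static String withBody(String xml) {
        try {
            return UPDATE_URL + "?commit=true&stream.body=" + URLEncoder.encode(xml, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Escapes the characters that would break the XML element content.
    private static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        System.out.println(deleteById("1"));       // hits the buggy path in 4.0
        System.out.println(deleteByQuery("id:1")); // removes the whole row
    }
}
```

On DataStax 4.0 the first URL leaves an empty row behind in Cassandra, while the second removes the row completely.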
Hacking
To figure out why this happens, I used the Java decompiler JD-GUI to decompile the DataStax code and created an Eclipse project from it. Then I changed the cassandra script in dse-4.0.5/resources/cassandra/bin so that JVM_OPTS includes the remote debug agent:
JVM_OPTS="$JVM_OPTS -agentlib:jdwp=transport=dt_socket,address=7777,server=y,suspend=n"
exec $NUMACTL "$JAVA" $JVM_OPTS $cassandra_parms -cp "$CLASSPATH" $props "$class"
Then restart DataStax: ./dse cassandra -s -f
This starts DataStax in remote debug mode, listening on port 7777.
Then attach the Eclipse debugger, set breakpoints in DirectUpdateHandler2 and CassandraDirectUpdateHandler, and run the Solr deleteById command.
Following the code, I found that in the deleteById case DataStax runs this CQL: delete field1, fieldn from table where id=1. This deletes only the listed fields (everything except the id field), so the row itself survives.
In the delete-by-query case, DataStax runs: delete from table where id=1, which deletes the whole row.
The root cause is in Cql3CassandraRowWriter.buildDeleteStatement, which does something unnecessary: it gets all the regular columns from Cassandra and builds a delete field1, ..., fieldn statement from them.
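The difference between the two statements is easier to see side by side. The following is only a sketch of the two string shapes, not the DataStax code: the class name, method names, the ", " separator and the key clause format ("id" = '1') are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the two CQL delete shapes described above. */
final class DeleteStatementSketch {

    /** Column-level delete: clears the listed columns, but the row's key stays. */
    static String deleteColumns(String keyspace, String table,
                                List<String> columns, String keyClause) {
        StringBuilder quoted = new StringBuilder();
        for (String column : columns) {
            if (quoted.length() > 0) {
                quoted.append(", "); // separator is an assumption
            }
            quoted.append('"').append(column).append('"');
        }
        return String.format("DELETE %s FROM \"%s\".\"%s\" WHERE %s",
                quoted.toString(), keyspace, table, keyClause);
    }

    /** Row-level delete: no column list, so the whole row is removed. */
    static String deleteRow(String keyspace, String table, String keyClause) {
        return String.format("DELETE FROM \"%s\".\"%s\" WHERE %s",
                keyspace, table, keyClause);
    }

    public static void main(String[] args) {
        String keyClause = "\"id\" = '1'"; // assumed key clause format
        System.out.println(deleteColumns("ckeyspace", "tablename",
                Arrays.asList("field1", "fieldn"), keyClause));
        System.out.println(deleteRow("ckeyspace", "tablename", keyClause));
    }
}
```

The first statement names columns after DELETE, so Cassandra removes only those cells; the second has no column list and removes the row, key included.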
DataStax fixed this problem in 4.7:
4.0.5 code from com.datastax.bdp.search.solr.Cql3CassandraRowWriter:
public void deleteById(SolrQueryRequest request, String key)
    throws IOException
{
    String cqlDeleteStatement = buildDeleteStatement(key);
    doDeletes(request, Arrays.asList(new String[] { cqlDeleteStatement }));
}

private String buildDeleteStatement(String key)
    throws IOException
{
    CFMetaData cfMetaData = this.columnFamilyStore.metadata;
    String compositeKeyClause = Cql3Utils.createKeyClauseFromSolrKey(cfMetaData.getKeyValidator(), cfMetaData.getCfDef(), key);
    // Collects every regular (non-key) column name, quoted for CQL.
    List columnNameArray = new ArrayList();
    for (CFDefinition.Name name : cfMetaData.getCfDef().regularColumns()) {
        columnNameArray.add("\"" + name.toString() + "\"");
    }
    // The column list after DELETE makes this a column-level delete, so the key survives.
    String delete = "DELETE %s FROM \"%s\".\"%s\" WHERE %s";
    return String.format(delete, new Object[] { commaJoiner.join(columnNameArray), this.coreInfo.keySpace, this.coreInfo.columnFamily, compositeKeyClause });
}
4.7 code from com.datastax.bdp.search.solr.Cql3CassandraRowWriter:
private String buildDeleteStatement(String key)
    throws IOException
{
    CFMetaData cfMetaData = this.columnFamilyStore.metadata;
    String compositeKeyClause = Cql3Utils.createKeyClauseFromSolrKey(cfMetaData, key);
    // No column list: this deletes the whole row.
    String delete = "DELETE FROM \"%s\".\"%s\" WHERE %s";
    return String.format(delete, new Object[] { this.coreInfo.keySpace, this.coreInfo.columnFamily, compositeKeyClause });
}
So for now, we can either upgrade to DataStax 4.7 or change our code to use deleteByQuery instead of deleteById.
Happy hacking...