
InfluxDB CEO and post author here. I'd love to hear feedback and answer any questions.


I have a question that's not about clustering.

We're using Influx for analytics, basically storing events with lots of detailed metadata and data, allowing us to run ad-hoc aggregations like "select count(*) from viewed_product where product.category = ? and site = ? and user_agent = ? group by time(1d)". We're storing a lot of nested data; we are in fact merging in original data models so that we have historical event data even if the original data is deleted.

But reading about the changes in 0.9, I've noticed that you're moving towards a more fine-grained data model tuned for collecting measurements, with typically a single value per item + metadata in the form of tags, a lot like Prometheus' data model. This would seemingly make it even less appropriate for our use case, despite the fact that InfluxDB is apparently still intended for analytics. Influx also isn't performing very well at querying our data (compared to ElasticSearch and PostgreSQL), though I have been hoping this is because Influx is young and unoptimized and (as far as I understand) doesn't index any part of the value, only the tags.

Can you shed some light on this? Should we move off InfluxDB? Or will the new tag-based data model improve our use case?


The new model combines tags, which are indexed, with fields, which are not indexed. A measurement can have up to 255 different fields, all of which can be written for a single data point.

As we push things forward we'll be adding more analytics queries in. But for the time being it's more aptly suited for metrics and sensor data.

With 0.9.0 you should be able to use a combination of fields and tags to get some fairly sophisticated queries. Where clauses also work on both tags and field values.
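The indexed-tags/unindexed-fields split can be sketched in a few lines. This is a toy illustration of the idea, not InfluxDB's actual storage code; the point names and structure are made up:

```python
# Toy model of 0.9 points: tags are indexed, fields are not.
points = [
    {"measurement": "cpu", "tags": {"host": "a", "region": "us"},
     "fields": {"value": 0.64, "idle": 99.1}, "time": 1},
    {"measurement": "cpu", "tags": {"host": "b", "region": "us"},
     "fields": {"value": 0.80, "idle": 97.3}, "time": 2},
]

# A tag index maps (tag key, tag value) -> matching points, so tag
# lookups avoid a full scan.
tag_index = {}
for p in points:
    for k, v in p["tags"].items():
        tag_index.setdefault((k, v), []).append(p)

# WHERE host = 'a' can use the index...
by_tag = tag_index.get(("host", "a"), [])

# ...while WHERE value > 0.7 must scan every point's fields.
by_field = [p for p in points if p["fields"]["value"] > 0.7]
```

So a WHERE clause can touch either kind of key, but only tag conditions are cheap.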

It might be easier to discuss on the InfluxDB Google Group. I'd like to hear more about your specific use case, the data you're writing in, and the kinds of questions you're asking of that data.


Interesting. So we'd see a lot of benefit from the new model.

Except our documents may easily have more than 255 fields. We would never query that many at once, mind you; but we don't know ahead of time what we'd need to query.

A nice thing about InfluxDB compared to other solutions is that it's schemaless and can still aggregate data pretty fast. ElasticSearch is much faster (than 0.8), but it has a big problem with indexes needing to be predefined (auto-creating mappings is highly flawed).

I might drop by the Google group with more questions.


Why is auto-creating mappings flawed?


Because ElasticSearch just assigns index settings based on the first value it gets.

For trivial values such as numbers, that's usually okay, unless it happens that the field is polymorphic (not a good schema design, of course).

But it doesn't know how to set up any of the mappings; it doesn't know whether something is a full-text field (which often requires analyzers) or an atomic, enum-type string.

It also doesn't know about dates. If you index pure JSON documents, it will simply store them as strings.

This would all have been a non-problem if updating mappings in ES were simple, but it's not. A mapping is generally append-only; if you want to change the type of a mapping, or its analyzer, or most other settings, you have to create a new index and repopulate it. Schema migrations in ES are a lot more painful than, say, PostgreSQL.
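The first-value inference problem can be shown with a toy sketch. This is not Elasticsearch's actual code, just an illustration of why typing a field from the first value seen goes wrong for dates and full-text strings:

```python
# Hypothetical sketch of infer-from-first-document mapping.
def infer_mapping(doc):
    mapping = {}
    for key, value in doc.items():
        if isinstance(value, bool):
            mapping[key] = "boolean"
        elif isinstance(value, (int, float)):
            mapping[key] = "number"
        else:
            # Dates, enum-like strings, and full text all look the same
            # in plain JSON: they're just strings.
            mapping[key] = "string"
    return mapping

first_doc = {"views": 12, "created_at": "2015-06-01T12:00:00Z"}
mapping = infer_mapping(first_doc)
# created_at gets typed as a plain string; fixing it later means
# creating a new index with the right mapping and reindexing.
```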


Paul - can you explain the difference between analytics and metrics/sensor queries??


Analytics are usually based on aggregating discrete events. It's based on irregular time series. Metrics and sensor data are usually regular time series. That is, series where you have samples taken at regular intervals of time, like once every 10 seconds.

When it comes to querying regular time series, you don't have a huge number of points to aggregate across a single series, whereas analytics can have millions in a single series that you're looking at.

Then there are other types of queries that you need in analytics that don't make sense in metrics, like getting sessions and calculating funnels.

InfluxDB is still useful for analytics today, it's just that in some instances it's more basic and crude compared to what you can do with things like MixPanel.
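The regular/irregular distinction is easy to see in code. A minimal sketch (made-up event data, not tied to any InfluxDB API): a regular series has one sample per interval, while an irregular event series can pack any number of events into a bucket, which is why a "group by time" over analytics data may aggregate huge numbers of points per bucket:

```python
from collections import Counter

# Irregular series: event timestamps (in seconds) arrive whenever
# users act, so buckets hold wildly different numbers of points.
events = [1, 1, 2, 2, 2, 2, 9, 9, 9]

def count_per_bucket(timestamps, width):
    """Rough equivalent of: select count(*) ... group by time(width)."""
    return Counter(t // width for t in timestamps)

buckets = count_per_bucket(events, 5)
# bucket 0 covers t=0..4 (6 events), bucket 1 covers t=5..9 (3 events)
```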


Is it possible that a regular time series could have better read performance (particularly in aggregations) vs an irregular one due to determinism/randomness - or is that irrelevant to the underlying implementation?


Looks interesting. This answers a lot of questions about primary data, but I'm interested to know how continuous queries [1] are handled.

If they're stored like any other data, the "inserts are far more likely than updates" assumption doesn't really make sense. For example, what if you have a continuous query grouped by year?

PS. "challenges" is mis-spelled in the last sentence.

1. http://influxdb.com/docs/v0.8/api/continuous_queries.html


Each of those continuous query data points would be an insert. I guess if you're recalculating it frequently, those new points would be updates; however, under non-failure conditions that won't be a problem.

If failure conditions occur, hinted handoff should take care of it, or the next CQ run that recomputes the result should make it correct after the failure condition is resolved.

Finally, those aren't contentious updates. You're not talking about multiple clients trying to make different updates to the exact same point, which is really what I mean when I talk about not optimizing for updates.


Hi Paul!

I have a question about the new tagging feature in InfluxDB 0.9 - hopefully you can clear up my confusion.

My understanding is that tags are great, as they are indexed and hence quick to search.

However, they are not suitable for situations where you have very high cardinality (> 100,000). Assuming this isn't the case, where else would you use fields over tags?

For example - apart from the indexing and cardinality issue - what are the pros/cons of

   "tags": {
       "host": "example.com",
       "direction": "bytesin"
   },
   "fields": {
       "value": 50
   }
and

   "tags": {
       "host": "example.com",
       "direction": "bytesout"
   },
   "fields": {
       "value": 60
   }
versus

   "tags": {
       "host": "example.com",
       "direction": "bytesin"
   },
   "fields": {
       "bytesin": 50,
       "bytesout": 60
   }
Or say you have a logline, and you're parsing various attributes out of it (meaning all the values are quite tightly associated with each other) - would you split them into separate series, each with their own set of tags, or would you store them in a single series with multiple fields?


In your example I'd probably have the direction in the measurement name. e.g.

{ "name": "network_bytesin", "tags": {"host": "example.com"}, "fields": {"value": 50} }

For fields, you generally only want to put two pieces of data together in a single point (thus in two fields) if you're always going to be querying them together. Either by pulling the values out, or filtering on a WHERE clause.

Unrelated to this, we've created a line protocol to write data in and it's a much more compact way to show a point:

network_bytesin,host=example.com value=50

Details here: https://github.com/influxdb/influxdb/pull/2696
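A minimal encoder for that line shape might look like the sketch below. This is a simplification of the real protocol (no escaping of spaces or commas in names, no integer/string field type markers), just enough to show the structure:

```python
# Simplified line protocol encoder: measurement,tag=v field=v [timestamp]
def to_line(measurement, tags, fields, timestamp=None):
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    line = f"{measurement}{tag_part} {field_part}"
    if timestamp is not None:
        line += f" {timestamp}"
    return line

line = to_line("network_bytesin", {"host": "example.com"}, {"value": 50})
# -> "network_bytesin,host=example.com value=50"
```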


Aha, fair enough - so if you only sometimes query the values together, you should separate them out into discrete series =).

For a logline, the sort of metrics we'd get would be things like query duration, database, client ID etc. You'd often query them together, but you'd also query them separately, so I guess that also makes sense as multiple separate series.

Interesting, that line protocol looks cool and seems like it'd be more efficient on the wire.

Will the drivers (e.g. Python, Go) be updated to take advantage of this new endpoint?

Finally, I see there's also a ticket open for binary protocols:

https://github.com/influxdb/influxdb/issues/139

Do you see that as being still potentially useful?


The Go client has already been updated. We'll be updating the Ruby and front end JS client. Hopefully the community will jump on updating the other clients once we release 0.9.0. It's a super simple protocol so I don't imagine it'll take much work.

The binary protocol is probably much less useful now. HTTP + GZip of the line protocol will already saturate what our storage engine can do at this point. In fact, I'm going to close that out right now...
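The gzip point is easy to check: line protocol is repetitive text, so a compressed batch shrinks dramatically, which undercuts the case for a binary protocol. A quick illustration (made-up batch, not a real InfluxDB payload):

```python
import gzip

# A batch of line protocol points differing only in their value.
batch = "\n".join(
    f"network_bytesin,host=example.com value={i}" for i in range(1000)
).encode()

compressed = gzip.compress(batch)
# The repetitive measurement and tag text compresses many times over,
# so the wire savings of a binary format would be marginal.
```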


Thank you for explicitly laying out assumptions. Far too much software (and hardware!) doesn't do that, or if they do it's buried.

I do worry about asymmetric failure modes, though. (Node A can talk to node B but not vice-versa)

Also:

> This keeps our overhead for doing conflict resolution incredibly low. No vector clocks are anything needed.

I'm pretty sure that should be "No vector clocks or anything needed."


I don't know if you've done this anywhere, but have you created any guidelines on tuning Influx based on workload?

We took a look at it last year, looked fantastic until we actually gave it more than a couple of metrics - and then it crashed and burned in so many ways. I tried fiddling with sharding and replication with no great success.

This was just before you were planning to get rid of the manual configuration for these things - so perhaps it's improved since then.

Having some idea of "If you want to store {X} metrics with {Y} updates per second for {Z} time, then you'll need at least this hardware configuration" would be great.

As it is, we've stuck with graphite, and we're kinda waiting for you guys to go past 1.0.


We don't have those recommendations at the moment. We'll be putting that out over the next few months after we release 0.9.0 and some point releases after that.

Out of curiosity, how many metrics are you tracking in Graphite? What's your sampling interval?


About 3,000, with 1 sample/second.


That should be doable if you're batching data together. The next build will actually do this automatically for you if you're writing into the Graphite, CollectD, or UDP inputs. It buffers some updates in memory to batch them in a single write to the underlying storage.
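The buffering idea described here can be sketched in a few lines. This is a hedged illustration of the pattern, not InfluxDB's implementation (a real version would also flush on a timer so a partially full buffer doesn't sit forever):

```python
# Sketch: buffer incoming points, flush to storage in batched writes.
class BatchingWriter:
    def __init__(self, storage_write, batch_size=100):
        self.storage_write = storage_write  # callable taking a list of points
        self.batch_size = batch_size
        self.buffer = []

    def write(self, point):
        self.buffer.append(point)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.storage_write(self.buffer)  # one write for the whole batch
            self.buffer = []

writes = []
w = BatchingWriter(writes.append, batch_size=3)
for i in range(7):
    w.write(i)
w.flush()
# writes == [[0, 1, 2], [3, 4, 5], [6]]
```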

Might be worth another look for you when 0.9.0 comes out :)


> Every server in the cluster keeps an in memory copy of this cluster metadata. Each will periodically refresh the entire metadata set to pick up any changes.

How does this avoid nodes doing stale reads from the inmemory copy resulting in each node having a slightly different out-of-date view of the cluster?


They could have an out of date view of the cluster. Like which databases exist, or which servers exist.

However, those cases are fine. If a database was created and a server doesn't have a copy of it, then when a write comes in it'll be a cache miss, so it'll hit the Raft system to get the info.

In the case of a new server joining the cluster, it won't get new data assigned until a shard group gets created and a shard gets assigned to it. When a write comes in, if a node doesn't have a shard group for that time range, it'll call out to the Raft system to get the information.

So yes, it's possible for some servers to have a stale view of this cluster metadata, but we work around it by having them request the information on demand and periodically refresh the entire thing.
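The read-through pattern being described might look like the following sketch. The `raft_get` stand-in and the data are made up; the point is just that a stale or empty local view degrades to an extra lookup, not a wrong answer:

```python
# Authoritative metadata lives behind the Raft system (stand-in dict).
raft_store = {"db1": {"shards": [1, 2]}}
raft_lookups = []

def raft_get(name):
    raft_lookups.append(name)  # track how often we fall back to Raft
    return raft_store.get(name)

cache = {}  # each node's in-memory copy of cluster metadata

def get_metadata(name):
    if name not in cache:          # stale or empty local view
        meta = raft_get(name)      # cache miss: ask the Raft system
        if meta is not None:
            cache[name] = meta
    return cache.get(name)

get_metadata("db1")   # miss: one Raft lookup
get_metadata("db1")   # hit: served from the local copy
```

A periodic full refresh of `cache` (as the post describes) bounds how stale the local copy can get between misses.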


Thanks for the more detailed explanation! Really looking forward to further posts -- not to mention 0.9!


I believe Raft is supposed to handle this; every modification is a log entry, and every recipient has to ack the log entry, a bit like two-phased commit.


This is incorrect, only reads from the current Raft master are guaranteed to not be stale. In the case of InfluxDB, I think caching is safe because the shard metadata is immutable.


Where are you seeing that cluster metadata is immutable? I don't even know how that would work. Surely nodes, databases, shards, users, permissions, etc. all can change?


Yep, you're right.


But the section I quoted said that local metadata queries go to the cache which is only updated "periodically."


Did you switch your Raft implementation & would you care to share your insights on the different implementations you guys have tried? (If there was a blog post on that, would appreciate the link).


This is the third Raft implementation for us. The original was goraft, then we attempted to make our own streaming raft implementation for high throughput.

This time around we realized we should only put very low write throughput stuff through Raft. And we wanted to pick one that has been out for a while and running in production in many places.

So we chose the Hashicorp implementation because we know them and know that it's been in production for a while. Combined with the fact that it can be backed by BoltDB, which we like.


Thanks, Paul.

My advice to you is to (1) model in TLA+. Writeup seems fine but, as you note, distribution is non-trivial (since intuition does not sufficiently/effectively inform the mental model). (2) Do not rely on monotonic nanosecs from Go runtime. Only usec level precision is guaranteed monotonic. (3) If you don't own the client you are open to byzantine failure.


> do not rely on monotonic nanosecs from Go runtime. Only usec level precision is guaranteed monotonic.

Interesting! Source? Any link about this?


http://play.golang.org/p/Hs7gPYTCgi

The issue is not Go (though imo they should update the doc to share the esoterica): http://stackoverflow.com/questions/4801122/how-to-stop-time-...


Thanks for the links. They are useful. I understand now.


I would ask, what makes the resulting design different from, say, Datomic?


Well, Datomic actually says they're CP, so that's one difference. (Linearizability of transactions)



