Big data
abstract
-
Replication, indexes, caching..Kafka
martin.kleppmann.com
-
WebRTC
developer.mozilla.org
-
Column-oriented DBMS
en.wikipedia.org
-
Shared nothing architecture
en.wikipedia.org
-
CAP theorem
en.wikipedia.org
-
About CAP
codahale.com
WebRTC: capture and/or stream data between browsers without an intermediary, plugins, or third-party software
articles
-
How Nonprofits are Using Data to Improve Lives
enterprise.import.io
-
How To Use Celery with RabbitMQ to Queue Tasks on an Ubuntu VPS | DigitalOcean
digitalocean.com
-
21 NoSQL Innovators To Look For In 2020 - Wikibon
wikibon.org
-
onurakpolat/awesome-bigdata
github.com
-
Paper Facebook Wormhole
PDF
-
Linkedin’s Scalable Consistent Change Data Capture Platform
PDF
MOM
A consequence of a messaging-based approach: there is no longer a need for a single conceptual model to underpin the integration effort.
-
nanomsg
nanomsg.org
-
Asynchronous Communication in SOA
saipraveenblog.wordpress.com
-
Redis based message queue
restmq.com
-
Disque – a distributed message broker | Hacker News
news.ycombinator.com
Kafka
Distributed commit log. Good for high-volume data processing pipelines and for real-time and batch consumers.
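To make the "commit log" idea concrete, here is a minimal in-memory sketch. It is not Kafka's API: the `CommitLog` class, its `append` and `poll` methods, and the consumer names are all hypothetical, chosen only to show how an append-only log with per-consumer offsets lets a real-time reader and a batch reader consume the same records at their own pace.

```python
class CommitLog:
    """Toy append-only log; each consumer tracks its own read offset.

    Hypothetical illustration only, not the Kafka client API.
    """

    def __init__(self):
        self._records = []   # records are only ever appended, never modified
        self._offsets = {}   # consumer name -> next offset to read

    def append(self, record):
        """Append a record and return the offset it was stored at."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, consumer, max_records=10):
        """Return up to max_records unread records for this consumer."""
        start = self._offsets.get(consumer, 0)
        batch = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return batch


log = CommitLog()
for event in ("signup", "login", "purchase"):
    log.append(event)

# A real-time consumer reads one record at a time...
first = log.poll("realtime", max_records=1)
# ...while a batch consumer reads everything in one go, unaffected by the other.
everything = log.poll("batch", max_records=100)
second = log.poll("realtime", max_records=1)
```

Because reading never removes records, adding another consumer does not disturb the existing ones; each simply starts from its own offset.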
-
Kafka rx slides
slides.com
-
cjdev/kafka-rx
github.com
-
Apache Kafka for Beginners
blog.cloudera.com
-
Clients - Apache Kafka - Apache Software Foundation
cwiki.apache.org
-
MODELING SPECIFIC DATA TYPES IN KAFKA
blog.confluent.io
-
Documentation
kafka.apache.org
Confluent
-
Blog
blog.confluent.io
-
Intro
confluent.io
-
Avro
confluent.io
-
Zookeeper
zookeeper.apache.org
-
Camus
confluent.io
-
Load data into Hadoop with Camus
blog.confluent.io
Avro: backward compatibility, forward compatibility, and full compatibility
Camus: a MapReduce implementation that dramatically eases continuous upload of data into Hadoop clusters
IMDB
High throughput; low latency. No buffer management. If serialized: no locking/batching.
-
In-memory database
en.wikipedia.org
-
tarantool.org
tarantool.org
-
MemSQL, VoltDB
informationweek.com
-
Comparison
kkovacs.eu
Tarantool: 2 storage engines (lockless in-memory, and disk with a 2-level B-tree); transactions and replication, with a WAL for CD
Datascript
-
tonsky/datascript
github.com
VoltDB
Ultra-fast transaction throughput at low latencies. Commonly used with a DW (or Hadoop) to optimize OLTP throughput & analytic queries/repor
-
Blog
voltdb.com
-
Comparing VoltDB to Postgres
pgsnake.blogspot.fr
MemSQL
-
MemSQL: The Fastest In-Memory Database
memsql.com
RocksDB
Embedded persistent key-value store. Not distributed, no failover, not highly available: if the machine dies, you lose your data.
-
Blog
rocksdb.org
-
MongoDB + RocksDB
blog.parse.com
-
Parse announcement
code.facebook.com
-
with osquery
code.facebook.com
deepstream
messaging server persisting to RethinkDB
-
A Scalable Server for Realtime Web Apps
deepstream.io
-
configure deepstream to use the RethinkDB storage connector
rethinkdb.com
-
Offline support
github.com
Magnet
-
Magnet Message
magnet.com
Avro
-
Effective Avro
blog.confluent.io
-
The problem of managing schemas
radar.oreilly.com
-
Schema evolution in Avro, Protocol Buffers and Thrift
martin.kleppmann.com
-
Tools for implementing Avro with Kafka
confluent.io
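The schema-evolution links above turn on the three compatibility modes noted for Avro: backward, forward, and full. The sketch below is a deliberately simplified illustration of how they relate, not the Avro library or any schema registry API. Schemas are modeled as plain dicts, the helpers `can_read` and `compatibility` are hypothetical, and type promotion, aliases, and nested records are ignored.

```python
def can_read(reader, writer):
    """True if data written with `writer` can be decoded using `reader`.

    Simplified rule (an assumption for this sketch): every reader field must
    either exist in the writer schema with the same type or carry a default.
    """
    for name, spec in reader.items():
        if name not in writer:
            if "default" not in spec:
                return False
        elif writer[name]["type"] != spec["type"]:
            return False
    return True


def compatibility(new, old):
    """Classify a schema change as 'full', 'backward', 'forward', or 'none'."""
    backward = can_read(new, old)  # new schema reads data written with old
    forward = can_read(old, new)   # old schema reads data written with new
    if backward and forward:
        return "full"
    if backward:
        return "backward"
    if forward:
        return "forward"
    return "none"


v1 = {"id": {"type": "long"}}
v2 = {"id": {"type": "long"}, "email": {"type": "string", "default": ""}}
v3 = {"id": {"type": "long"}, "email": {"type": "string"}}

added_with_default = compatibility(v2, v1)     # new optional field
added_without_default = compatibility(v3, v1)  # new required field
removed_required = compatibility(v1, v3)       # required field dropped
```

In this model, adding a field with a default is a fully compatible change, adding a field without one is only forward compatible, and dropping a field that had no default is only backward compatible.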
Hive
facilitates querying and managing large datasets residing in distributed storage
-
Apache Hive
hive.apache.org
SmartStack
-
airbnb/smartstack-cookbook
github.com
-
nerds.airbnb.com
nerds.airbnb.com
-
SmartStack vs. Consul
igor.moomers.org
Consul
-
consul.io
consul.io
stream processing systems
Compute streams derived from other data streams in real time, as events occur. Useful in areas with lots of complex transformations.
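As a rough illustration of computing one stream from another, the sketch below uses a plain Python generator. It does not use Storm, Spark, or Samza; the `running_totals` function and the sample purchase events are hypothetical, and a real system would add partitioning, fault tolerance, and durable state.

```python
from collections import defaultdict


def running_totals(events):
    """Derive a stream of (user, total so far) from a stream of (user, amount).

    Each input event immediately produces one output event, so the derived
    stream stays up to date as events arrive.
    """
    totals = defaultdict(int)  # per-user state kept by the processor
    for user, amount in events:
        totals[user] += amount
        yield user, totals[user]


# Hypothetical input stream; in practice this would be an unbounded source.
purchases = [("ann", 5), ("bob", 3), ("ann", 7)]
derived = list(running_totals(purchases))
```

Because `running_totals` is a generator, it can be chained with further transformations, each consuming the previous stream one event at a time.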
-
Storm
storm.apache.org
-
Spark
spark.apache.org
-
Samza
samza.apache.org
-
Samza vs Storm
samza.apache.org
-
Samza vs Storm
stackoverflow.com
-
Samza vs Spark
samza.apache.org
HDFS
-
HDFS vs. Other Storage Tech: Benefits and Advantages of HDFS
hortonworks.com
Hadoop
computational engine
-
Six Super-Scale Hadoop Deployments
datanami.com
-
Hadoop pioneered this approach.
blog.confluent.io