Friday, October 24, 2014



Comparing and analyzing two symposium talks from industry leaders of open source technology in the world of IT.

“How Companies Use NoSQL Open-Source Technologies like Couchbase” – Don Pinto, Product Marketing Manager (Couchbase)

Don Pinto (M.Sc., Computer Science – University of Toronto) has previously worked as the director of product management at GridCentric Inc. (now owned by Google), with additional experience as a SQL Server/Azure program manager at Microsoft.

The Problem – “There is lots and lots of data. More users than ever before and the interactive complexity of apps.”

“Consumers & Employees Demand Highly Responsive Apps.”

Older relational stores were rigid and could not scale out data easily. Those two factors, along with performance costs, made up some of the most common client complaints about such tools.

This calls for a new backend technology – NoSQL.
So what could be a candidate to be the right tool?

·        The JSON Data Model Fits today’s developer needs better
o   Aggregates & denormalizes data into single document (Document data model).
o   Handles structured & unstructured data equally well (Docs are distributed evenly across servers)
o   Inferred schema requires no migration
o   JSON rapidly being adopted
o   Access both JSON and binary data as key-value pairs
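
To make the document-model points above concrete, here is a minimal Python sketch. It is not Couchbase client code: the `users` and `orders` data, the `build_user_document` helper, the `user::1` key format, and the plain dict standing in for a key-value store are all hypothetical, chosen only to show aggregation into one document and adding a field without a migration.

```python
import json

# Hypothetical normalized rows, roughly as a relational schema might hold them.
users = {1: {"name": "Ada"}}
orders = [
    {"order_id": 10, "user_id": 1, "item": "book"},
    {"order_id": 11, "user_id": 1, "item": "lamp"},
]


def build_user_document(user_id):
    """Aggregate a user and their orders into one self-contained document."""
    doc = dict(users[user_id])
    doc["type"] = "user"
    doc["orders"] = [
        {"order_id": o["order_id"], "item": o["item"]}
        for o in orders
        if o["user_id"] == user_id
    ]
    return doc


# A plain dict stands in for the key-value store (an assumption for illustration).
store = {}
key = "user::1"
store[key] = json.dumps(build_user_document(1))

# Adding a new field later needs no schema migration: read, modify, write back.
loaded = json.loads(store[key])
loaded["preferences"] = {"newsletter": True}
store[key] = json.dumps(loaded)
```

Reading the whole user back is a single key lookup instead of a join, which is the trade-off the denormalized model makes.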

·        RDBMS needs a bigger, more expensive server to scale up architecture.
·        Auto-sharding vs. Manual sharding (data partitioning).
·        Open source generally implies lower costs for maintenance and usage.

·        Availability – Relational systems use clustering as an afterthought.
o   RDBMS must take database down for “maintenance windows”
o   They struggle to support XDCR (Cross data center replication) across many DCs (data centers).

·        Couchbase offers a full range of Data Management solutions
o   High Availability Cache (Zero downtime administration and upgrades)
§  Always-on functionality for a potentially global user base
§  Couchbase Lite – An embedded mobile database that syncs mobile work back to the server through a sync gateway.
o   Consistent High Performance
§  Built-in object level cache
§  Fine grained locking
§  Hash partitioning to uniformly distribute data across the cluster
o   Elastic Scalability –
§  Shared-nothing architecture with a single node type
§  True XDCR
§  Push button scale-out
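
The hash partitioning and auto-sharding ideas in the list above can be sketched in a few lines of Python. This is a simplified illustration, not Couchbase's actual implementation: the partition count, the round-robin partition map, the node names, and the helper functions are all assumptions made for the example.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 1024  # assumed fixed partition count, for illustration only


def partition_for(key: str) -> int:
    """Hash a key to a partition number with CRC32."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def build_partition_map(nodes):
    """Assign partitions to nodes round-robin (a simplifying assumption)."""
    return {p: nodes[p % len(nodes)] for p in range(NUM_PARTITIONS)}


def node_for(key: str, partition_map) -> str:
    """Look up which node owns the partition a key hashes to."""
    return partition_map[partition_for(key)]


nodes = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical node names
partition_map = build_partition_map(nodes)

# Count how many of 10,000 sample keys land on each node.
counts = Counter(node_for(f"user::{i}", partition_map) for i in range(10_000))
```

Because keys map to partitions and partitions map to nodes, adding a node only requires reassigning some partitions in the map. The application never has to decide where a key lives, which is the contrast with manual sharding.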

Some use cases for NoSQL:
·        Heavily accessed web landing pages
·        Application objects
·        Popular search query results
·        Session values or cookies (key-value pair store), e.g. shopping carts, selected flights, etc.
·        User profile with a unique ID, user settings/preferences, user application state
·        Content metadata stores (articles, text)
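
As a rough illustration of the session use case in the list above, the sketch below stores a shopping cart under a session key with an expiry time. The `SessionStore` class, its method names, and the key format are hypothetical; it is an in-memory stand-in, not the API of Couchbase or any other product.

```python
import json
import time


class SessionStore:
    """In-memory stand-in for a key-value store with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def set(self, key, value, ttl_seconds):
        """Store a JSON-serializable value that expires after ttl_seconds."""
        self._data[key] = (json.dumps(value), self._clock() + ttl_seconds)

    def get(self, key):
        """Return the stored value, or None if it is missing or expired."""
        entry = self._data.get(key)
        if entry is None:
            return None
        payload, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]
            return None
        return json.loads(payload)


store = SessionStore()
store.set("session::abc123", {"cart": ["sku-1", "sku-2"]}, ttl_seconds=1800)
cart = store.get("session::abc123")["cart"]
```

The whole session is read and written by one key, so no query language or schema is involved, which is why this workload suits a key-value store.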

Some known users of Couchbase:
·        Orbitz – 11 clusters with a total of 100 nodes
o   3 TB of data with over 430 million objects
·        McGraw-Hill Education Labs – Content and metadata stores
o   “Building a self-adapting, interactive learning portal”
o   Scale to millions of learners
o   Self-adapt via usage data
·        AOL – Ad-targeting using a Couchbase server
o   40 milliseconds to respond with the decision.
o   User profiles, real time campaign stats
o   Affiliate, event, profile, and campaign data

Mike Hoye, Engineering Community Manager at Mozilla: “Social Engineering – Building Communities With, And On, Purpose”

·       "Process reifies and reinforces values"
·        If you don’t measure it, don’t pretend you care about it
·        The ROI on timely gratitude is ridiculous.
·        Karma is a wheel (courtesy, saying thank you for the things you are given)

“The way you conduct and execute your process is a direct reflection of your values.”

Access, Engagement, Retention - “If you let a patch sit for a week from a first contributor, it is very unlikely you will see them contribute again.”

·        What is in front of a user, if they want to commit a one-line change to your project?
·        The importance of comprehensive documentation
·        The “miraculous” benefits of an easy-to-set-up build environment

·        “Throw the little fish back in the water for the new entrants to the game.”
·        Label good first bugs for beginner contributors and give a concise, yet thorough explanation of how to go about fixing them

·        “A single toxic contributor can harm an entire community. If people feel unequally welcome in the community, many will inevitably shy away from it. Don’t be a jerk and don’t let others become jerks.”
·        Gratitude. Saying thank you. “This bug and your fix matters.” Telling them what to do next.

Why does open source matter? Am I the first person to have this problem?

-        Mythology: what fills the gap in the absence of real numbers and real data about what works and what doesn’t. Stories are powerful; they get into people’s heads and stay there.

-        Open source is meritocratic (we need to stop talking about ourselves like we’re “magic”)

-        Diminishing returns: After 3 sets of eyes looking at a piece of code to figure out a bug, the rest are wasting their time…

-        Strong FSOSS and FSOSS-like communities grow organically

What are the most basic, fundamental things we need to embark upon an open source project?

1)      Source Control
2)      Issue Tracking
3)      Automatic Testing

Do you care?
“There is no regression test for somebody’s mood.”

“Your community is an API to your software.”
What is the state of our community? What problems does our community have?
Are we actively fostering community engagement?

“Have a code of conduct. Have a code of conduct. Have a code of conduct.”

Choosing this particular pair of talks yielded a comprehensive picture of the open-source process. One focused on community building and project philosophy, the prerequisites, while the other presented the benefits of a completed and deployed product that resulted from many of the same ideals and principles. The two speakers’ points and values were not only in implied agreement but also complementary. The talks differed in focus and dealt with largely unrelated processes because of their subject matter, but they were ultimately two sides of the same coin.

The open-source paradigm and its implications have not so much changed for me as flourished over time. Admiring the philosophies behind open source came naturally, but I was initially confused and doubtful about the financial feasibility of companies and institutions that fully embrace free software and open-source processes and values with their intellectual property. The state of this maturing industry after nearly a quarter-century of existence is clear evidence of its continued success. I see that evidence mainly in giants like Microsoft and Google investing in more open-source startups and releasing more open-source code and products, and in watching (and contributing to) organizations like Red Hat and Mozilla, which are built almost entirely on open-source ideals, grow over the years into major industry players.
