The area of quantitative measurement is also seeing a number of new initiatives. The largest trend is what Andrew Montalenti from the analytics company Parse.ly referred to as the “democratization of the data pipeline,” where open source tools are maturing to the point that running your own analytics collection is becoming much easier. This development is notable as it opens the door for direct ownership over analytics information, as well as a lower barrier to entry for custom solutions. That is to say, if a company isn’t happy with the speed, interface, or flexibility of Google Analytics, it could more easily build its own in-house platform. This task is no small undertaking to be sure, but new advances bring it within the realm of possibility. Two projects in this space, Snowplow and Piwik, are worth mentioning.
Fairly new, Snowplow is an open source project that allows users to record user events and store the data on their own infrastructure.38 It is an open source “data pipeline” and gives users the real-time speed of something like Chartbeat with the quantity of time-series data that Google Analytics provides. By contrast, for most of what Google Analytics records, users must wait roughly 24 hours for that data to become available.
The Guardian started using Snowplow in early 2015 for the analytics on its Soulmates and membership pages. As opposed to Google Analytics, which tends to look at the page view as the atomic unit of consumption, Snowplow’s event-based system makes it easier to track user behavior and attach metadata to each action, said Dominic Kendrick, a software engineer at The Guardian. He also appreciates that it provides this data within five minutes of any user action. “The speed and control you have over what is recorded is the biggest thing, because innovation is limited by the speed of the software you implement. If you use a third party, you’re limited to that schedule,” Kendrick said. “Three years ago no one was doing this, but now you have options.”
Importantly, Snowplow concerns itself with efficiently recording and storing event-level interactions with a high degree of customization; it does not come with a visual dashboard out of the box. Advanced users will see this as a benefit, since it means they can create custom visualizations that answer specific questions their newsrooms might have. Others, however, might feel like they’re getting a mere bicycle frame (albeit a robust, free, and versatile one) when what they had in mind was something they could ride out of the store.
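To make the event-based model described above more concrete, here is a minimal sketch in Python of what recording individual user actions with attached metadata, and then asking a custom question of them, might look like. It is not Snowplow's actual schema or API: the `Event` class, its field names, and the helper functions are all hypothetical and exist only for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Event:
    """One user action plus free-form metadata (hypothetical schema)."""
    user_id: str
    action: str          # e.g. "page_view", "scroll", "signup_click"
    page: str
    timestamp: datetime
    metadata: dict = field(default_factory=dict)


def count_actions(events, page):
    """Count how often each action type was recorded on one page."""
    return Counter(e.action for e in events if e.page == page)


def users_who_did(events, action):
    """Return the set of user ids that performed a given action."""
    return {e.user_id for e in events if e.action == action}


# Invented sample data: two visitors to a membership page.
now = datetime.now(timezone.utc)
events = [
    Event("u1", "page_view", "/membership", now),
    Event("u1", "scroll", "/membership", now, {"depth_pct": 75}),
    Event("u1", "signup_click", "/membership", now, {"button": "header"}),
    Event("u2", "page_view", "/membership", now),
]

print(count_actions(events, "/membership"))
print(users_who_did(events, "signup_click"))
```

A page-view-centric tool would report two views of the membership page; the event records additionally show that only one of those visitors scrolled and clicked the signup button, which is the kind of question a newsroom would have to build its own dashboard to answer.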
Which system makes practical sense will differ based on the resources an organization devotes to analytics, but growth and wider adoption in this area show promise for future iterations of NewsLynx-like systems. Snowplow’s website keeps an updated list of companies currently using the system.39
Similar to Snowplow, Piwik is another open source analytics suite.40 It provides multiple dashboard interfaces for viewing analytics results. The largest implementation of Piwik we are aware of is at OpenStreetMap (OSM), a kind of Wikipedia for mapping the world that relies on open source, community-created mapping data.41 Eric Brelsford, a developer at the nonprofit 596 Acres42 […] Pratt Institute, uses Piwik regularly. “We wanted just what Google Analytics does but in an open source way,” Brelsford said. “It also did a great job of importing our raw traffic data from our server logs so we could see our traffic from even before we had Piwik installed.”
While a number of WordPress plugins exist for CMS integration,43 newsrooms we spoke to had almost no awareness of Piwik’s existence. Vendor solutions still dominate the field of analytics but, as mentioned above, the recent maturation and further testing of these open source tools at scale could change that dynamic in the future.