Etech > The Data Dump > Fun With Graphs and Charts

David Sifry – state of the blogosphere
Mar 2006: over 100,000 blogs created daily
a new weblog every second
50% of new bloggers still posting 3 months later
10% of all blogs update weekly or more
about 9% of new blogs are spam, with spikes as high as 25%
60% of pings come from known spam sources
technorati blocks most of these now
daily posting volume
1.2 million legitimate posts/day
about 50,000 postings per hour
Blogs vs mainstream media: some blogs are now up there with, and replacing, traditional media
language breakdown – Japanese 41% – English 28% – Chinese 14%, all others 4% or less
Japanese posts tend to be shorter, so there are more of them
almost one half of blog posts use tags or categories
over 81 million tagged posts, growing 400k/day
about 24% of all posts use the rel="tag" microformat (only created 1 year 3 months ago)
More than search – it’s about exposing community and context
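The rate figures in Sifry's numbers can be cross-checked with simple arithmetic. A minimal sketch, using only the daily totals quoted in these notes; the constant and function names are my own:

```python
# Daily totals as quoted in the notes above.
BLOGS_CREATED_PER_DAY = 100_000
LEGIT_POSTS_PER_DAY = 1_200_000

SECONDS_PER_DAY = 24 * 60 * 60
HOURS_PER_DAY = 24


def per_second(per_day):
    """Convert a daily count into an average per-second rate."""
    return per_day / SECONDS_PER_DAY


def per_hour(per_day):
    """Convert a daily count into an average per-hour rate."""
    return per_day / HOURS_PER_DAY


new_blogs_per_second = per_second(BLOGS_CREATED_PER_DAY)
posts_per_hour = per_hour(LEGIT_POSTS_PER_DAY)

print(f"new blogs per second: {new_blogs_per_second:.2f}")
print(f"legitimate posts per hour: {posts_per_hour:,.0f}")
```

Both quoted rates hold up: 100,000 blogs a day is a little over one per second, and 1.2 million posts a day is 50,000 per hour.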
Eric ? – CTO Feedburner
showing us FeedStorm – a visualisation of hits after launch, going from a few drops to lots, punctuated by events like the launch of iTunes podcasts, which makes the screen go mad. A quick drill of what the visual is representing:
position is random
size is size of feed/circulation
colours are types (text vs podcast, for example)
200,000 feeds managed now
Adam Messinger – Gauntlet Systems
next gen continuous integration system
no more broken builds or smoke tests
comprehensive analytics
Lotka’s Law – in 1926 Lotka observed an 80/20-style relationship in scientific productivity
the posit is that this applies to development too, as in the few most productive developers account for the majority of the code
Three open source projects that have checked into their system
ActiveMQ – two guys do 80% of the work, the remaining 10 or so do the rest
Lucene – flatter curve, three of ten do the majority of the work (but it’s a mature product)
Hibernate EJB3 – most work done by two, with only 5 others doing the rest (commercial)
so it holds true that a few developers do most of the work
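The "few developers do most of the work" claim is easy to check against any commit log. A minimal sketch, assuming you already have commit counts per author; the counts below are made up for illustration and are not Gauntlet's data or taken from any of the three projects:

```python
def top_share(commits_by_author, top_n):
    """Fraction of all commits made by the top_n most active authors."""
    counts = sorted(commits_by_author.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    return sum(counts[:top_n]) / total


def authors_needed(commits_by_author, target_share):
    """Smallest number of authors whose commits reach target_share of the total."""
    counts = sorted(commits_by_author.values(), reverse=True)
    total = sum(counts)
    running = 0
    for n, count in enumerate(counts, start=1):
        running += count
        if running >= target_share * total:
            return n
    return len(counts)


# Hypothetical commit counts, for illustration only.
SAMPLE_COMMITS = {
    "dev_a": 420,
    "dev_b": 380,
    "dev_c": 60,
    "dev_d": 50,
    "dev_e": 40,
    "dev_f": 30,
    "dev_g": 20,
}

print(f"top 2 share: {top_share(SAMPLE_COMMITS, 2):.0%}")
print(f"authors needed for half the commits: {authors_needed(SAMPLE_COMMITS, 0.5)}")
```

Commit counts are a crude proxy for "work", so treat the result as a first look at the shape of the curve rather than a measure of productivity.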
Andy Edmonds – Windows Live Search
Data Dump, ETech 06
year in review – the world’s attention leaves traces on the internet
most popular places looked for in Virtual Earth – top 1000 queries
a visual of web queries
Roger Magoulas – O’Reilly
AJAX/Ruby trends – normalisation – how to spot emerging trends
Lots of graphs tracking trends through the number of Google pages referencing the term
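The talk did not spell out how the normalisation is done, so this is only one plausible reading: divide each term's page count by a baseline count for the same period, so that overall growth of the index does not pass for a trend. A minimal sketch with made-up numbers; the function names, the terms and the baseline are all my own assumptions:

```python
def normalise(term_counts, baseline_counts):
    """Express each period's term count as a share of that period's baseline."""
    if len(term_counts) != len(baseline_counts):
        raise ValueError("series must cover the same periods")
    if any(b <= 0 for b in baseline_counts):
        raise ValueError("baseline counts must be positive")
    return [t / b for t, b in zip(term_counts, baseline_counts)]


def growth_factor(series):
    """Ratio of the last value to the first value in a series."""
    if series[0] == 0:
        raise ValueError("first value must be non-zero")
    return series[-1] / series[0]


# Hypothetical page counts per period, for illustration only.
BASELINE = [1000, 2000, 4000]  # all indexed pages (assumed baseline)
TERM_A = [100, 200, 400]       # grows only as fast as the index
TERM_B = [50, 150, 600]        # grows faster than the index

term_a_growth = growth_factor(normalise(TERM_A, BASELINE))
term_b_growth = growth_factor(normalise(TERM_B, BASELINE))

print(f"term A normalised growth: {term_a_growth:.2f}x")
print(f"term B normalised growth: {term_b_growth:.2f}x")
```

The raw count for term A quadruples, but its share of the index is flat. Term B's share roughly triples, which is the kind of signal an emerging trend would show.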
Ian Kallen – Technorati
Web pages lie but the stats don’t
The Metrics of the Dark Underbelly
blog spam facts from the guy building Technorati while it flies…
They tap the stream of data to find spam
They expose domains that are known spammers
looking for concentrations of blogs around shared signals, for example clusters on a domain or a common AdSense ID
again lots of stats on Blogspot, not because of the service itself but because it is free
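The "concentration" idea can be sketched as grouping blogs by a shared signal and flagging unusually large groups. This is not Technorati's actual pipeline, which was not shown; the record fields, the sample data and the threshold are all hypothetical:

```python
from collections import defaultdict


def clusters_by(blogs, key):
    """Group blog URLs by the value of one field, skipping blogs that lack it."""
    groups = defaultdict(list)
    for blog in blogs:
        value = blog.get(key)
        if value:
            groups[value].append(blog["url"])
    return groups


def suspicious_clusters(blogs, key, min_size):
    """Return only the groups that contain at least min_size blogs."""
    return {
        value: urls
        for value, urls in clusters_by(blogs, key).items()
        if len(urls) >= min_size
    }


# Made-up records, for illustration only.
SAMPLE_BLOGS = [
    {"url": "http://a.example.com/", "domain": "example.com", "adsense_id": "pub-0001"},
    {"url": "http://b.example.com/", "domain": "example.com", "adsense_id": "pub-0001"},
    {"url": "http://c.example.net/", "domain": "example.net", "adsense_id": "pub-0001"},
    {"url": "http://d.example.org/", "domain": "example.org", "adsense_id": "pub-0002"},
    {"url": "http://e.example.org/", "domain": "example.org", "adsense_id": None},
]

flagged = suspicious_clusters(SAMPLE_BLOGS, "adsense_id", min_size=3)
for adsense_id, urls in flagged.items():
    print(adsense_id, len(urls), "blogs")
```

A large cluster is a prompt for a closer look, not proof of spam: as the Blogspot note above says, a free host will show up heavily in the stats simply because it is free.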
? –
a run through of setting up a vault for your clickstream, nothing new here
the sessions run over, and there is chaos: another session is supposed to be starting, and people are leaving, arriving and pushing past
David Hornik – August Capital
Sept 2005 to Feb 2006
A funny skit of email breakdown
most common mail type: Introduction (976)
themes around taking breaks and not doing work; a nice break from the norm, and to be truthful an indication of the old spirit of ETech: just a bit of harmless fun
