Website: http://www.educer.org/
Original page: http://www.educer.org/2009/09/29/on-pubsubhubbub-part-2-get-with-it-push-youre-supposed-to-be-realtime/
I guess there are rough edges around all new servers.
The rpc.rsscloud.org server you are probably referring to is maintained by Dave with no promise of uptime (correct me if I'm wrong, obviously) as a place to test your implementation. It is possible that the server can be rebooted for changes at any time. No big company is providing a constant connection here.
The WP plugin for rssCloud creates a server on every blog that installs it. Problems have been few and far between with this.
The PubSubHubbub server hosted by Google has been pushed into every Blogger feed and implemented heavily in FeedBurner feeds by Google. It pushes updates from multiple IPs, which suggests a network of hubs that they are using to guarantee uptime. By doing this, they have told me that they are ready for real time.
A quick browse of the source will show you that it's a simple app. There's no "network of hubs".
All App Engine applications use multiple IP addresses for connections. The number of IP addresses used by these apps should not be used to infer a guarantee of uptime.
By the way, an application running on App Engine cannot subscribe to rssCloud because these applications do not use the same IP address for inbound and outbound connections.
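To illustrate why that matters, here is a minimal sketch, assuming the cloud builds its notification target from the source address of the subscription request plus a port and path the subscriber supplies. The function name and the addresses (taken from documentation-reserved ranges) are hypothetical.

```python
def notify_target(remote_addr: str, port: int, path: str) -> str:
    """Where a cloud that keys on the caller's source address would send notifications.

    Assumption: the cloud ignores any hostname and trusts only the address
    the subscription request arrived from.
    """
    return f"http://{remote_addr}:{port}{path}"


# Hypothetical subscriber: it listens on one address, but its outbound
# requests leave from a different one.
inbound_addr = "203.0.113.10"
outbound_addr = "198.51.100.7"

# The cloud only ever sees the outbound address.
target = notify_target(outbound_addr, 80, "/notify")

# Notifications go to an address the subscriber is not listening on.
reachable = target.startswith(f"http://{inbound_addr}:")
```

Under that assumption, `reachable` is false for any host whose inbound and outbound addresses differ, which is the situation described for App Engine apps.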
Got it, no multiple hubs. But by using multiple IPs from the App Engine network, a distributed network and uptime are inferred.
My point is - if you're going to push yourself into millions of feeds as a solution, then you are ready to go.
BTW, I'm not arguing against PubSubHubbub here. I want it to work. That's all that I'm getting at.
You should not infer that applications running on App Engine have a guarantee of distribution or uptime.
Recent blog posts from the App Engine team indicate that applications run in a single data center at a time. The apps are single homed, not distributed across multiple data centers.
Plugging the hub into many feeds seems like a great way to bootstrap and test the realtime cloud, even if there are some rough edges.
Plugging the hub into many feeds is a great way to test. But, two things.
1) When you're big like Google and you announce the implementation, the perception given to me, the developer, is that you're ready.
2) When I, the developer, do start using it, I'll get a perception on how it's working and I'll share it.
If others have details on how it's working for them, I'd love to share examples. I'm having fun working with both rssCloud and PubSubHubbub and coding > arguing.
There are rough edges on all the work that's going on. We should cut everybody some slack including Brett Slatkin.
http://adsenseforfeeds.blogspot.com/2009/07/wha... - Google announces PubSubHubbub support in FeedBurner feeds for AdSense, notifying a "Google-run Hub".
http://googlereader.blogspot.com/2009/08/pubsub... - Google announces PubSubHubbub support for shared items in Reader.
http://buzz.blogger.com/2009/08/blogger-joins-h... - "All blog post feeds now contain a "hub" element, and will ping Google's hub on every post update."
http://googlecode.blogspot.com/2009/08/towards-... - "we have gone a step further and added PubSubHubbub support to Google Alerts."
I'm not trying to push any blame on Brett. Google owns this now and should help work through any issues. I've written about some of the issues I'm seeing with Google's PubSubHubbub hub.
Constructive discussion about the issues I've been seeing is definitely welcome.
Otherwise, what is your subscriber's average latency for handling notifications? The reference Hub is defensive about delivering to subscribers that track many feeds and are slow to respond. So, if you're taking over 5 seconds, you may see slowdowns. It's best to process incoming notifications asynchronously if you can.
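A minimal sketch of that advice, assuming a subscriber whose HTTP layer hands each notification body to a handler function. All names here are hypothetical and not taken from any hub or subscriber library; the idea is only that the handler queues the payload and acknowledges right away, while a background thread does the slow work.

```python
import queue
import threading

# Hypothetical names throughout; a sketch, not any library's API.
notifications = queue.Queue()
processed = []


def handle_notification(body: bytes) -> int:
    """Called by the HTTP layer for each hub POST; returns the status code to send."""
    notifications.put(body)  # cheap: no parsing or database work on the request path
    return 204               # acknowledge with a 2xx right away so the hub is not kept waiting


def worker() -> None:
    """Drain the queue in the background; slow work belongs here."""
    while True:
        body = notifications.get()
        try:
            # Stand-in for parsing the feed and storing new entries.
            processed.append(body.decode("utf-8"))
        finally:
            notifications.task_done()


threading.Thread(target=worker, daemon=True).start()
```

With this shape, the time the hub waits depends only on the enqueue, not on how long feed parsing or storage takes. An in-memory queue loses pending items if the process dies, so a persistent queue may be a better fit for a real subscriber.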
This comment stream aside, the original post was written more as my perception than science. Rereading, it's a pretty unorganized perception. Ahh, late nights. There's more to the rambling, but if you come away with one thing from the above, it's that I don't see the FeedBurner stuff being real time as I thought it would.
I'll be grinding through the data more closely as the week goes on. The initial conclusions are based on a snapshot look at the initial 24 hours or so of use.
I can't speak to the latency yet, but I also can't imagine it being too high. Not a perfect answer, I know, but the server is on Amazon's EC2 and overall latency (network and system) seems low. Almost the only traffic coming in is from rssCloud and PubSubHubbub notifications. From watching Dave's rssCloud log (light pings), the time posted is usually less than 0.300 seconds.
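One way to get an actual number is to time the handler itself. The following is a rough sketch under the assumption that notifications pass through a single handler function; `timed`, `handle`, and `summary` are hypothetical names, and the 5-second default simply mirrors the threshold mentioned earlier in the thread.

```python
import time
from statistics import mean

durations = []  # seconds spent handling each notification


def timed(handler):
    """Wrap a handler so the wall-clock duration of every call is recorded."""
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return handler(*args, **kwargs)
        finally:
            durations.append(time.perf_counter() - start)
    return wrapper


@timed
def handle(body: bytes) -> int:
    """Stand-in for the real notification handler."""
    return 204


def summary(threshold: float = 5.0) -> dict:
    """Average handling time and the number of calls slower than the threshold."""
    return {
        "count": len(durations),
        "average": mean(durations) if durations else 0.0,
        "slow": sum(d > threshold for d in durations),
    }
```

This measures only time spent inside the handler, not network transit between the hub and the subscriber, so it answers the "how long do I take to respond" half of the question.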
The feeds that I've noticed the most issues with are from FeedBurner. My guess is that the delay and re-pushes are due to the ping scheduling between publisher->FeedBurner->PuSH. Once publishers start pinging the hub directly instead of relying on a middle man, I would think that these issues would clear up. See previous post directed at publishers.
The feeds that I've noticed the best response with are from Google Reader shared items. Again, perception, but things seem to run pretty smoothly here.
My generated twitter-link feeds seem to be sporadic when done in quick succession. @mmastrac pointed out after I posted last night that this could be a "race" between the feed writing and the hub reading if things are happening quickly enough. I still need to explore that.
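If the race is real, one possible fix on the publisher side is to make sure the new feed is fully in place before the hub is pinged. The sketch below assumes the feed is served from a static file, which may not match how these feeds are actually generated; the function names are hypothetical, and the ping body uses the `hub.mode=publish` and `hub.url` parameters from the PubSubHubbub publish step. It only builds the body and does not send anything.

```python
import os
import tempfile
from urllib.parse import urlencode


def write_feed_atomically(path: str, xml: str) -> None:
    """Write to a temp file in the same directory, then rename it over the old feed."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(xml)
        f.flush()
        os.fsync(f.fileno())
    # Note: mkstemp creates the file with restrictive permissions; adjust if a web server reads it.
    os.replace(tmp, path)  # readers see either the old feed or the new one, never a partial file


def publish_ping_body(feed_url: str) -> str:
    """Form-encoded body for a publish ping; build and send it only after the write returns."""
    return urlencode({"hub.mode": "publish", "hub.url": feed_url})
```

Calling `write_feed_atomically` first and pinging second means a hub that fetches immediately cannot read a half-written file. It does not help if a cache or proxy in front of the feed still serves the old copy.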
You've given me a bunch of stuff to look at. I'll do what I can to start logging and parsing all of it and then provide the results. Hopefully I find a few problems with my code to fix along the way. :)