Reddit Architecture and Extension Proposal


Published by: footeb on Dec 14, 2012
Copyright:Attribution Non-commercial


This alternative attempts to keep the link recommendations as up to date as possible
while providing better performance than alternative 1. To do that, we recalculate the link
recommendations each time a new vote comes in and store the pre-computed results in
Cassandra. As shown in Figure A.3, we use the Closest Neighbor batch process
(mentioned in the accepted proposal), which finds the nearest neighbor accounts for
a user. Whenever a nearest neighbor up-votes a link, the vote handler calls the
recommendation engine. The recommendation engine looks into the nearest neighbor, link
and vote database: if the link has been up-voted by a nearest neighbor of a user and that
user has not voted on it, the link is recommended to the user. The recommendation engine
sends this link recommendation to the vote handler, which in turn updates the Cassandra
Cache Chain. Figure A.4 shows how this approach maps to the top-level functional components.
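
The update flow described above can be sketched roughly as follows. This is a minimal illustration, not the proposal's implementation: the in-memory dictionaries stand in for the nearest neighbor data, the vote database and the Cassandra store, and every name (`neighbors_of`, `votes`, `recommendations`, `handle_upvote`, and so on) is hypothetical. Dropping a link from the voter's own recommendations once they vote on it is an added assumption that follows from the "not voted by the user" rule.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the stores named in the text.
# neighbors_of[u]: nearest neighbor accounts of user u (output of the batch process).
neighbors_of = {
    "alice": {"bob", "carol"},
    "dave": {"bob"},
}
# votes[u]: link ids that user u has already voted on.
votes = defaultdict(set)
# recommendations[u]: pre-computed recommendations (stands in for Cassandra).
recommendations = defaultdict(set)


def users_with_neighbor(voter):
    """Return the users who have `voter` among their nearest neighbors."""
    return [user for user, near in neighbors_of.items() if voter in near]


def recommend_on_upvote(voter, link_id):
    """Recommendation engine step: users who should be recommended this link."""
    return [
        user
        for user in users_with_neighbor(voter)
        if link_id not in votes[user]  # skip users who already voted on the link
    ]


def handle_upvote(voter, link_id):
    """Vote handler step: record the vote, then store new recommendations."""
    votes[voter].add(link_id)
    # Assumption: a link the voter has now voted on is no longer a candidate for them.
    recommendations[voter].discard(link_id)
    for user in recommend_on_upvote(voter, link_id):
        recommendations[user].add(link_id)
```

In this sketch, an up-vote by `bob` adds the link to the stored recommendations of every user who lists `bob` as a nearest neighbor and has not yet voted on that link. All of the work happens on the vote path, which is the coupling the evaluation of this alternative discusses.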

The main advantage of this alternative is that its recommendations are more up to date than
those of our accepted proposal. It also performs better than alternative 1, because we
calculate the recommendations each time a user submits a vote rather than when the user asks
for recommendations (by clicking on the recommendations panel), as in alternative 1.
It has better availability than alternative 1, because even if the recommendation engine goes
down, the older link recommendations remain available to the user. Also, the unavailability
of this component would not hurt the availability of the rest of the components.
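
The availability argument can be illustrated with a small sketch that contrasts the two read paths. It is an assumption-laden illustration only: `stored_recommendations` stands in for the pre-computed results in Cassandra, and `EngineUnavailable`, `compute_recommendations`, `read_on_demand` and `read_precomputed` are hypothetical names, with the engine stub hard-wired to fail in order to simulate an outage.

```python
# Hypothetical stand-in for the pre-computed results stored in Cassandra.
stored_recommendations = {"alice": ["link1", "link7"]}


class EngineUnavailable(Exception):
    """Raised by the hypothetical engine stub while it is down."""


def compute_recommendations(user):
    """Engine stub: always fails here, to simulate an outage."""
    raise EngineUnavailable(user)


def read_on_demand(user):
    """Alternative 1 style read: the request depends on the engine being up."""
    return compute_recommendations(user)


def read_precomputed(user):
    """Alternative 2 style read: serve the last stored results, never call the engine."""
    return list(stored_recommendations.get(user, []))
```

With the engine down, `read_on_demand("alice")` raises, while `read_precomputed("alice")` still returns the older stored links. The stored results may be stale, but the read itself does not depend on the engine.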

Scalability - The vote handler servers will have to do additional processing under this
alternative, because the recommendation engine is tightly coupled to the vote handler
functionality. It may therefore be harder to scale the application, because the vote handler
is more likely to hit the performance elbow. On the other hand, we will be adding new
recommendation batch servers for the recommendation engine, which will be easier to scale.
On balance, we consider scalability to be neutral for this alternative.

However, this alternative has two disadvantages:
1. Evolvability: Since we are modifying the existing vote handler component, adding new
functionality is not going to be easy. Because the recommendation functionality will be tightly
coupled with the existing vote handler logic, it will be much more time consuming and
technically challenging to implement new algorithms and fix bugs.
2. Cost: Since this alternative is more complex and more time consuming, we will be spending
more on developers to implement this functionality than for our accepted proposal.

Even though this alternative has several benefits, it is going to be more time consuming and
cost us more than the accepted proposal. This is the main reason for not going with this
alternative.


Figure A.3 - The incremental update approach dataflow diagram


Figure A.4 - Top-level functional view modified for alternative #2
