DISTRIBUTED DATA-DRIVEN COMPUTING

NETWORK
    Myrinet
    10G-ethernet
    http://www.myri.com

Shared-Nothing Architecture
IP broadcasting / multicasting
map-reduce
TOPOLOGIES
FreeBSD
C language
Network-link layer
Distributed system
PostgreSQL
TCP/IP broadcasting / multicasting
Source Code Control (git, cvs, gitorious, github ?)

CODE MIGRATION
    Spreading code across data nodes; this code is in Stored Procedure (SP) context
    The SP that has to be spread is the SP for index processing
    supported platform-independent code

In the first step we build PgDTS using simple, intuitive mechanisms and algorithms that may perform poorly but work.

OTHER PROJECTS
    Rocks http://www.rocksclusters.org/wordpress/

data distribution policy and mechanism
    provide mechanisms to support data distribution policy

XID Generation
    method#1 using agreement among backend nodes
    method#2 using a special node or a group of nodes called XID_Generators
        What happens when an XID_Generator fails?
            use more than one XID_Generator node, forming an XID_Generator sub-cluster
            when there is only one XID_Generator, the recovery method is to fall back to method#1

HOTTEST TOPIC:
    High-Availability
    Scalability
    Single-Point-of-Failure (SPOF)

Distributed Database Management System (DDBMS)
    talk about "node" or in terms of "node"

How we set the goals
    Which one is increasing when adding more nodes to the cluster?
    [plot: H-A on the vertical axis]
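
The XID Generation notes above (method#2 with a sub-cluster of XID_Generators, falling back to method#1 when no generator is left) could be sketched roughly as follows. This is only an illustration under assumptions: the types xid_t, struct xid_generator and struct backend, all function names, and the agreement rule itself (highest XID seen by any backend, plus one) are hypothetical and not taken from the notes. XID wraparound and real network messaging are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t xid_t;          /* hypothetical XID type; wraparound ignored */

/* Hypothetical view of one XID_Generator node in the sub-cluster. */
struct xid_generator {
    int   alive;                 /* non-zero while the generator is reachable */
    xid_t next_xid;              /* next XID this generator would hand out */
};

/* Hypothetical per-backend state used by method#1. */
struct backend {
    xid_t last_seen_xid;         /* highest XID this backend knows about */
};

/* method#1 (assumed rule): every backend agrees on max(last_seen) + 1. */
xid_t xid_by_agreement(struct backend *nodes, size_t n)
{
    assert(nodes != NULL && n > 0);
    xid_t max = nodes[0].last_seen_xid;
    for (size_t i = 1; i < n; i++)
        if (nodes[i].last_seen_xid > max)
            max = nodes[i].last_seen_xid;
    xid_t xid = max + 1;
    for (size_t i = 0; i < n; i++)
        nodes[i].last_seen_xid = xid;    /* all backends adopt the result */
    return xid;
}

/* method#2: take the XID from the first alive generator and keep the other
 * alive generators in step. Returns 0 on success, -1 if none is alive. */
int xid_from_generators(struct xid_generator *gens, size_t n, xid_t *out)
{
    assert(out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!gens[i].alive)
            continue;
        *out = gens[i].next_xid;
        for (size_t j = 0; j < n; j++)
            if (gens[j].alive)
                gens[j].next_xid = *out + 1;
        return 0;
    }
    return -1;
}

/* Prefer method#2; fall back to method#1 when every generator has failed. */
xid_t next_xid(struct xid_generator *gens, size_t ngens,
               struct backend *nodes, size_t nnodes)
{
    xid_t xid;
    if (xid_from_generators(gens, ngens, &xid) == 0) {
        /* Backends record the issued XID so a later fallback continues
         * from the right value. */
        for (size_t i = 0; i < nnodes; i++)
            nodes[i].last_seen_xid = xid;
        return xid;
    }
    return xid_by_agreement(nodes, nnodes);
}
```

Recording each issued XID on the backends is one possible way to keep the sequence increasing across a fallback; a real design would also have to handle a generator that rejoins with a stale counter.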

    [plot, continued: Scalability on the horizontal axis]

H-A vs SPOF
Scalability vs Bottleneck
Crash Recovery

What is to be distributed? disk data and RAM data?

achieve H-A
    Shadowing an intermediary-node:
        an intermediary-node shadow can be created by having one or more intermediary-nodes handle the same group of data-nodes.
        Multiaddress: one MAC address serves one or more PDTS addresses
            purpose: shadowing an intermediary-node
    Shadowing a data-node:
        a shadow data-node can be made if there is one or more data-nodes that have access to the same storage.
        another solution is connecting two or more data-nodes using a dedicated network, where one data-node acts as the master node and the other nodes act as shadow nodes. All activities on the master node are distributed to the shadow nodes. Utilizes the multihoming feature of the PDTS Network.
    Where will the shadow be implemented?
        at the PDTS Network
        or at a higher level: at the Message Passing Protocol
        or at a still higher level: at the Postgres backend/data/intermediary-node server
        how?

PDTS network (as opposed to IP network)
    mapping: IP address <-> PDTS address <-> data-link address
        e.g. 192.168.x.x <-> Network.node.peer# <-> MAC address of ethernet/myrinet/others
    The PDTS network is a logical network. It maps to the physical network (datalink layer network):
        1-to-1 : one network on one physical network, such that one NIC is assigned to only one network and there is only one network on that physical network
        m-to-1 : many networks using one physical network
        1-to-m : one network spans more than one physical network; one network is assigned to more than one NIC, such that data sent to a network (not to a node) is broadcast on the assigned physical networks
        m-to-m : there is one physical network serving more than one network ...
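
The PDTS address <-> data-link address mapping and the Multiaddress idea (one MAC address serving one or more PDTS addresses) could be held in a small lookup table, sketched here under assumptions. The notes do not define a PDTS address layout; the three-field form (network, node, peer) is only a reading of the "Network.node.peer#" example, and struct pdts_addr, struct pdts_map_entry and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical PDTS address, modelled on the "Network.node.peer#" example. */
struct pdts_addr {
    uint16_t network;
    uint16_t node;
    uint16_t peer;
};

/* One row of the mapping: PDTS address -> data-link (MAC) address. */
struct pdts_map_entry {
    struct pdts_addr addr;
    uint8_t          mac[6];
};

static int pdts_addr_equal(const struct pdts_addr *a, const struct pdts_addr *b)
{
    return a->network == b->network && a->node == b->node && a->peer == b->peer;
}

/* Resolve a PDTS address to its MAC address.
 * Returns a pointer to the 6-byte MAC, or NULL when the address is unknown. */
const uint8_t *pdts_resolve(const struct pdts_map_entry *tab, size_t n,
                            const struct pdts_addr *addr)
{
    assert(addr != NULL);
    for (size_t i = 0; i < n; i++)
        if (pdts_addr_equal(&tab[i].addr, addr))
            return tab[i].mac;
    return NULL;
}

/* Multiaddress: count how many PDTS addresses one MAC address serves. */
size_t pdts_addresses_on_mac(const struct pdts_map_entry *tab, size_t n,
                             const uint8_t mac[6])
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (memcmp(tab[i].mac, mac, 6) == 0)
            count++;
    return count;
}
```

With a table like this, a shadow intermediary-node would simply be a second PDTS address that resolves to a MAC address already serving another one; a linear scan is enough for a sketch, though a real table would likely be hashed.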

Node Status:
    * Operational
    * Live-at-zero: ready-to-die
    * Live-at-zero: shadowing
    * Expected-to-die (with/without auto-shutdown, w/wo server terminate)
    * Not-operational

H-A degrees:
    1 running more than one instance on different nodes (needs load balancing)
    2 standby backup (needs a failover solution)
    3 off backup (needs turning on the off backup)
    4 no backup (that is SPOF)

Automatic Failover of datanode:
    * "a node" could be served by one or more machines
    * when there is more than one machine on a node, those machines act as if there were only one machine
    * the purpose of automatic failover is ...

Resource discovery -> active node discovery -> datalink multicast

Pathwork:
    - concept
    - coding
    - analysis
    - optimization ...