Study on Implementation and Impact of Google Hacking in Internet Security

Muharman Lubis¹, Nurul Ibtisam binti Yaacob², Hafizah binti Reh³ and Montadzah Ambag Abdulghani⁴
International Islamic University of Malaysia (IIUM)
¹muharmanlubis@gmail.com, ²nibisam@gmail.com, ³ta_hafizBS@yahoo.com, ⁴mon_ebdulgani@yahoo.com

ABSTRACT - As the number of websites and the amount of information have increased, modern lives rely more on search engines to scoop up relevant pieces of information out of the information sea. In response to the rapidly growing amount of information on the web, major search engine companies such as Google, Yahoo, and MSN crawl web servers and index the crawled information more frequently and thoroughly on a global level. Furthermore, to stay on top of the competitive search engine market, they diligently improve their search algorithms and endeavor to provide Internet users with easy-to-use search interfaces. Due to this diligent and competing effort of search engine companies, Internet users can freely access billions of pages of information regardless of time and space constraints with simple typing and clicking. Google Hacking uses the Google search engine to locate sensitive information or to find vulnerabilities that may be exploited. This paper evaluates how much effort it takes to get Google Hacking to work and how serious the threat of Google Hacking is. The paper discusses the implementation and impact of Google hacking in Internet security.

Index Terms - Google hacking, Internet security, Google implementation, impact

I. INTRODUCTION

The idea of hacking may conjure up stylized images of electronic vandalism, espionage, dyed hair and body piercing, but its essence is none of these. Most people associate hacking with breaking the law and assume that all those who engage in hacking activities are criminals.
Indeed, there are people who use hacking techniques to break the law, but hacking is not really about that [9]. Hacking is a way of understanding what is possible, sensible and ethical in the twenty-first century, and it is embedded in our lives because a hack needs a social and cultural context [10]. To understand what hacking is, we need to know the difference between hacking and cracking [11]. The definition of hacking has become confused with that of cracking because of the role of the media and the carelessness of some people. All these hacking activities exist within a set of communal relations, each of which expresses a different aspect of hacking.

Recently, hackers have been divided into three categories: black hats, white hats and grey hats, referred to as malicious hackers, ethical hackers and ambiguous hackers respectively [17]. Black hat hackers are people who hack computing systems for their own benefit, or who break into systems illegally for personal gain, notoriety or other less-than-legitimate purposes [14][15]. For example, they may hack into an online store's computer system and steal the credit card numbers stored in it. White hat hackers are those who write and test open source software, work for corporations or are hired by companies to help them beef up their security, work for the government to help catch and prosecute black hat hackers, and otherwise use their hacking skills for noble and legal purposes. There are also hackers who refer to themselves as gray hat hackers, operating somewhere between the two primary groups [14]. Gray hat hackers might break the law, but they consider themselves to have a noble purpose in doing so.
For example, they might crack systems without authorization and then notify the system owners of the systems' fallibility as a public service, or find security holes in software and then publish them to force the software vendors to create patches or fixes for the problem.

Google Hacking is one of the most popular techniques in hacking activities. It was publicly introduced by Johnny Long around 2004 and is defined as "the art of creating complex search engine queries in order to filter through large amounts of search results for information related to computer security" [16]. Attackers can use Google Hacking to uncover sensitive information about a company or to uncover potential security vulnerabilities through the Internet. Others can use Google Hacking to determine whether their own websites are disclosing sensitive information, which is known as penetration testing or vulnerability assessment.

When a computer connects to a network and begins communicating with other computers, it is essentially taking a risk. Internet security involves the protection of a computer's Internet account and files from intrusion by an unknown user. Basic security measures involve protection by well-selected passwords, changes of file permissions and backups of the computer's data. In this paper, implementation is defined as "carrying out, execution, or practice of a plan, a method, or any design for doing something" [16] and impact is defined as "a forceful consequence; a strong effect". Both terms relate to the methodology of Google Hacking as used by malicious people or groups and to its effect on Internet security.

II. BACKGROUND

Nowadays, it is really hard to search for information, as the number of resources available on the Internet is increasing at a rapid rate.
Consequently, the search engine, also known as the web search engine or automated web search, has been introduced as one of the services provided by the Internet. A search engine is designed to help find information stored on a computer system or on a website and to minimize the time required to find it [8]. One of the most powerful, efficient and effective search engines is Google. Currently, the Google search engine indexes up to 12 billion pages [17]. Believe it or not, it is the starting point of many hacking activities, and this is also one of the most interesting uses of the Google search engine by certain people. This kind of activity is known as Google Hacking.

Organizations often disclose too much information on their web servers without ever knowing that a leak or weakness is there, and malicious hackers make use of it. Further, search engines like Google have powerful features that allow users to find sensitive information stored in the far corners of web-connected servers and even to perform a vulnerability-searching attack. In the past two years, Google hacking has become a commonplace term not only in the security community but in the mainstream media as well [13]. Google hacking involves using the popular Google search engine to locate sensitive or confidential online information that should be protected but is not.

Using search engines to uncover sensitive information is not a new concept. Nonetheless, with the numerous advanced search operators that Google makes available on its enormous database, carefully crafted query strings can reveal jaw-dropping results [3]. The filtering is usually performed with advanced Google operators, and attackers can use Google Hacking to uncover sensitive information about a company or to uncover potential security vulnerabilities.
A security professional can use Google Hacking to determine whether their websites are disclosing sensitive information. Google Hacking has turned out to be a very powerful and flexible hacking approach, and Google cached pages have been found very helpful when performing it [2]. Google crawls web pages and stores a copy of them on its own servers. The authors of [2] tried to use Google cached pages to browse a target's site anonymously without sending a single packet to its server. Google grabs most of a page through its crawls but omits images and some other space-consuming media. As a result, a hacker who views a cached page by simply clicking on the cached link on the results page will still end up connecting to the target's server to get the rest of the page content.

When a user enters a keyword in the search box, the spider starts exploring the Web. Google then returns a results page that consists of a list of site names, a summary or snippet of each site, the URL of the actual page, a cached link that shows the page as it looked when the spider last visited it, and a link to pages that have similar content [3]. Google's search results are dynamic. When a query is submitted through Google's web interface, Google takes the user to a generated results page that can be represented by a single URL, which appears in the address bar of the user's browser. For instance:

http://www.google.com/search?hl=en&q=%22peanut+butter+and%22+jelly&btnG=Search

The question mark (?) denotes the end of the URL and the start of the arguments, the ampersand (&) separates arguments, (hl) represents the language in which the results page will be printed, (q) represents the start of the query string, (%22) represents the hexadecimal value of the double quote character, the plus sign (+) represents a space, and (btnG=Search) denotes that the Search button was pressed on Google's
web interface [13]. Knowledgeable Google users can edit the URL directly inside their browser's address bar and hit enter to get new search results very quickly. For a security professional, it is critical to understand these URLs in order to perform a Google hacking vulnerability assessment.

The Google Hacking Database (GHDB) is a database of queries contributed by hackers that identify sensitive data on a website, such as error messages, files containing passwords, files containing usernames (no passwords), files containing juicy info (no usernames or passwords, but interesting stuff nonetheless), pages containing logon portals, pages containing network or vulnerability data such as firewall logs, sensitive online shopping information, various online devices and vulnerable servers [17][19].

III. IMPLEMENTATION

Google hacking has become popular not only in hacker communities but also among ordinary people who do not really understand hacking procedures. The method and design of using the Google search engine for hacking activities involve combining basic and advanced Google operators to make searching and finding as specific as possible. The implementation can be divided into four forms: the formal design, which refers to the googledorks and googleturds introduced by Johnny Long and recognized at the academic and media level; the Google automated scanner, which refers to specific software built by communities or individuals to facilitate Google hacking; manual exploration, which refers to individual attempts by malicious hackers to deepen their Google hacking knowledge; and integrated hacking, which uses Google hacking as the first step before other methods of hacking.

Table 1.
Advanced Google Operators

Search Service | Advanced Search Operators
Web Search     | allinanchor:, allintext:, allintitle:, allinurl:, cache:, define:, filetype:, id:, inanchor:, info:, intext:, intitle:, inurl:, phonebook:, related:
Image Search   | allintitle:, allinurl:, filetype:, inurl:, intitle:
Groups         | allintext:, allintitle:, author:, group:, insubject:, intext:, intitle:
Directory      | allintext:, allintitle:, allinurl:, ext:, filetype:, intext:, intitle:, inurl:
News           | allintext:, allintitle:, allinurl:, intext:, intitle:, inurl:, location:, source:
Product Search | allintext:, allintitle:

One well-known way of exploiting the Google search engine is the "googledork". It is an attempt to standardize the functions of Google Hacking by the person who first introduced it, Johnny Long, who shares this knowledge on his website. The term "googledork", coined by Long, originally meant "An inept or foolish person as revealed by Google" [19], but after a great deal of media attention the term came to describe those who "troll the Internet for confidential goods." Either description is fine; what matters is that the term googledork conveys the concept that sensitive material is on the web and Google can help you find it. The official googledorks page lists many different examples of unbelievable things that have been dug up through Google by the maintainer of the page, Johnny Long, grouped into around 14 categories in the GHDB [17][19]. Each listing shows the Google search required to find the information, along with a description of why the data found on each page is of interest.

On the other hand, the syntax and operator functions of the Google search engine are not free of errors or weaknesses either. These are recognized as "googleturds", defined as the little dirty pieces of Google "waste" [1]. Such search results seem to stem from typos Google found while crawling a web page. Google can also reveal a great deal of personal data when its advanced search parameters are used.
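As an illustration of the URL anatomy described in the Background section and of the operators listed in Table 1, the following is a minimal Python sketch that uses only the standard library. The function names `split_search_url` and `build_search_url` are hypothetical, and the host and the parameter names (`hl`, `q`, `btnG`) are taken from the example URL in this paper rather than from any documented interface, so the sketch should be read as an assumption-laden illustration and not as a description of how Google processes queries.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Example results-page URL quoted in the Background section.
EXAMPLE = ("http://www.google.com/search"
           "?hl=en&q=%22peanut+butter+and%22+jelly&btnG=Search")


def split_search_url(url):
    """Return the decoded query-string arguments of a search URL as a dict."""
    parts = urlsplit(url)
    # parse_qs splits on '&', decodes %22 to a double quote and '+' to a space.
    return {name: values[0] for name, values in parse_qs(parts.query).items()}


def build_search_url(query, lang="en"):
    """Compose a results-page URL from a query that may contain operators.

    The base URL is an assumption copied from the paper's example.
    """
    return "http://www.google.com/search?" + urlencode({"hl": lang, "q": query})


if __name__ == "__main__":
    print(split_search_url(EXAMPLE))
    # An operator query combining two entries from Table 1.
    print(build_search_url('intitle:"index of" inurl:/backup'))
```

Decoding and re-encoding through the standard library avoids hand-editing percent escapes, which is where manual edits in the address bar most often go wrong.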
Before they were corrected, googleturds were remarkably revealing, exposing credit card data, web directories, passwords and much more. Google has become more concerned with correcting the errors in its operators and patterns because of strong pressure from organizations and companies over the large impact of the use of these errors. Google hacking involves the use of certain types of search queries to look for website vulnerabilities. There are approximately 1,500 such queries, mostly stored on the Google Hacking Database website by Johnny Long, while some have simply spread to other discussion websites and blogs. This Google Hacking trend is slowly but surely leading more people to develop easy-to-use tools or software that search effectively and efficiently by combining certain googledork methods. Unfortunately, organizations sometimes set up their systems in a way that allows Google to index and save a lot more information than they intended [23].

Another implementation is the Google automated scanner. One of the best known is Goolag Scanner, a Windows-based auditing tool built around the concept of "Google hacking". The Cult of the Dead Cow hacker group released it as an open-source tool designed to enable IT workers to quickly scan their websites for security vulnerabilities and at-risk sensitive data, using a collection of specially crafted Google search terms. Its stated aims are to provide a very easy and legitimate tool for security professionals to test their own websites for vulnerabilities and to raise awareness about Web security in and of itself [22]. Many attempts have been made to carry out the Google hacking process automatically with software, so that even a newcomer can use it.
Goolag Scanner, Gooscan, Google Hacks and subdomain lookup tools are examples of software that facilitates these activities, and their number will probably increase over time along with the growing popularity of Google.

Interestingly, this approach encourages many more people to try using Google operators to search intentionally for private data across the Internet, which forces organizations and companies to put their best effort into prevention. Conversely, expert malicious hackers can take advantage of a situation in which large numbers of newcomers are trying Google hacking. Once they know the patterns, in-depth searching through manual exploration can be done effectively and efficiently. The ease of use of Google Hacking is remarkable: at any time and from anywhere, the Google search engine can be used to find private data for various purposes. Private data searches are grouped into four sections according to privacy level: identification data, sensitive data, confidential data and secret data searches [21].

Identification data relates to the personal identity of a user. It can be found with keywords such as name, address, phone, email, curriculum vitae and username, optionally for a certain person or within certain document types.
For example, the following query returns many lists of identity data:

allintext:name email phone address ext:pdf

Sensitive data is data that is public but contains private details whose disclosure might anger the owner, such as emails, forum postings, sensitive directories and Web 2.0 based applications. The following query finds sensitive directories:

intitle:"index of" inurl:/backup

Confidential data is private data that should be accessible only to a certain group or person, such as passwords, chat logs, confidential files and online webcams, yet Google can still reveal this kind of data. The following query finds the addresses of online webcams:

inurl:"viewerframe?mode=motion"

Lastly, secret data is private data accessible only to its owners, such as encrypted messages, private keys and secret keys. Encrypted messages can be found with a query such as:

-intext:"and" ext:enc

All kinds of manual search exploration require only in-depth knowledge of formats, patterns, operators and parameters, together with practice. A person can become an expert in Google hacking independently, as an autodidact. However, the worst case in using the Google search engine is combining it with other hacking methods, so that it serves only as the beginning of a process (footprinting, port scanning, information gathering and so on) that leads to further hacking. This is known as integrated hacking. The prospect of the Google search engine being used merely as the first step in discovering current weaknesses or vulnerabilities has become a hot topic in efforts to prevent and recognize such attacks. Nowadays, pretty much any hacking incident most likely begins with Google [1], so the use of Google Hacking is only the beginning, yet its impact is already hazardous.
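The four privacy levels above can also be turned into a self-check that a site owner runs against their own domain, in the spirit of the penetration testing use mentioned in the Introduction. The sketch below is a minimal illustration under stated assumptions: the name `audit_queries` and the domain `example.com` are hypothetical, the `site:` operator is used to restrict results to one domain, and the secret-data entry keeps only a file-extension filter as a simplification. It only builds query strings and does not contact any search engine.

```python
# Example queries per privacy level, following the classification in [21].
EXAMPLE_QUERIES = {
    "identification": "allintext:name email phone address ext:pdf",
    "sensitive": 'intitle:"index of" inurl:/backup',
    "confidential": 'inurl:"viewerframe?mode=motion"',
    "secret": "ext:enc",  # simplified assumption: extension filter only
}


def audit_queries(domain, queries=EXAMPLE_QUERIES):
    """Prefix each query with site:<domain> to limit results to one's own site."""
    return {level: f"site:{domain} {query}" for level, query in queries.items()}


if __name__ == "__main__":
    # example.com is a placeholder for a domain the auditor is authorized to test.
    for level, query in audit_queries("example.com").items():
        print(f"{level:15s} {query}")
```

Keeping the queries in a dictionary keyed by privacy level makes it easy to add entries from a query collection such as the GHDB and to report findings per level.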
Organizations, corporations and even individuals who store their data on websites should develop their own strategies, policies and procedures to keep that data safe and secure from being revealed.

IV. IMPACT

Search engines, and Google in particular, have become an important tool in our daily use of the Internet, so it is difficult to tell at the outset which user or group intends to carry out Google hacking. Consequently, there is no definite method of identifying them; at this moment only the protection and response of both Google and website system administrators can be measured. The massive wave of Google hacking has had both direct and indirect impacts on Internet security. In this study we classify the direct impact into low impact, associated mostly with exploration; moderate impact, associated mostly with exposure; and high impact, associated mostly with exploitation, each with both positive and negative effects. Trend and standardization; awareness of Internet security; and strategy, policy and procedure make up the indirect impact.
Low Impact      | Free access to online newspapers; finding information regarding certain people; Google locking the "googleturds"; mass quantity of Google hacking users; subdomain lookup; Google hacking as artificial intelligence; Google proxy server hacking; awareness on Internet security
Moderate Impact | Finding vulnerabilities in servers, files, web files, web applications, unauthenticated programs, various online devices, etc.; Google Analytics; implementation checklist; application fraud; enhanced tools to protect privacy; penetration testing; Google Hacking exposed; VoIP footprinting; standardization; hacking process analysis; social engineering; ethical hacker certification; Google Hacking as malicious hacking; honeypot project; Google Code hacking trap
High Impact     | Oracle database exploration; fraud prevention; telecommunication; safeguarding privacy against misuse and exploits; indexing and revealing sensitive or confidential data; SAP Enterprise Portal; enhanced defense security; security exposure strategy; identity theft; the fraud management industry; advanced MySQL exploitation; Internet security management development (prevention, protection, response, recovery); Internet safety, security and privacy manuals; Apache database exploration

Table 2. Google Hacking Impact

Over the last several years, identity theft has been one of the fastest growing crimes. Unfortunately, the Internet has been facilitating this phenomenon, since it represents a tremendous open repository of sensitive identity information available to those who know how to find it, including fraudsters [20]. Exploiting even a single identity is a serious matter; when one user exploits many identities, the damage to the company, the system and the affected users is far greater. Google hacking has become a trend in the information and communication technology community, to the point that the community has pursued standardization, the process of developing and agreeing upon technical standards among its members.
Both trend and standardization have also become hot topics of discussion, influenced by the massive use of Google hacking.

Nowadays, information, systems and networks are pervasive and ubiquitous, and all of them are provided through the Internet. The Internet's vast resources are an excellent means for everyone to explore, research and enjoy new information and interests. The Internet is a public place, however, so it has become important to teach Internet users how to stay safe, because danger also lies there. We need to learn about Google Hacking in order to provide a good level of protection for our sites and to check for sensitive information disclosure as part of a strategy and policy for anticipating Google hacking. As we become more familiar with manual hacks, we can start using some of the automated Google Hacking tools. These automate the hacks and help ensure that every single page within our site is protected. Automated tools allow for periodic security checks at a frequency that is simply impossible to achieve with manual hacks. The following are the common activities, based on strategy, policy and procedure, that organizations usually carry out to assure Internet security [18]:

1. Ensure host and network security basics are in place.
2. Build/publish security features (authentication, role management, key management, audit/log, crypto and protocols).
3. Use external penetration testers to find problems.
4. Create and share a standard policy.
5. Identify gate locations and gather necessary artifacts.
6. Know all regulatory pressures and unify approach.
7. Identify personally identifiable information (PII) obligations.
8. Provide awareness training.
9. Create security standards.
10. Perform security feature review.
11. Identify software defects found in operations monitoring and feed them back to development.
12. Ensure QA supports edge/boundary value condition testing.
13. Create or interface with incident response.
14. Create data classification scheme and inventory.
15. Use automated tools along with manual review.

Shared knowledge of strategies, policies and procedures is an advantage that every company or organization has in fighting back against the Google hacking threat. Even though full security never exists, these measures do keep confidential or sensitive data away from the script kiddies and newcomers to Google Hacking, whose numbers grow day after day as a consequence of the openness of knowledge on the Internet. Discussion and improvement should take place frequently in order to test the strength of the strategies, policies and procedures, and the process of assurance and monitoring should also be carried out effectively. Maintaining these three approaches is difficult, like any other maintenance process. We cannot avoid it, however, because risk management is the attempt to prevent disaster from striking the security of the company. Prevention is better than cure; in this case, a security measure is better than disaster recovery.

Google hacking can become a serious threat to an organization or company. If hackers spend enough time analyzing the target and understanding how the queries find information, they will be able to find the information they want, even if that information is confidential or sensitive. Moreover, the well-known ways of exploiting the Google search engine, the googledorks and the Google automated scanners, help hackers to find sensitive data easily using Google. This shows that Google hacking is almost effortless for today's hackers.

V. CONCLUSION

To be publicly accessible is the nature of web sites and applications.
Combined with search engine functionality, this makes it effortless for attackers to access an organization's site or to find out information about the organization. Some organizations do not realize that even directory listings, error pages and hidden login pages can be indexed, and that when a search engine "indexes" a site it inadvertently provides loads of information for potential attackers. This is what Google hacking is all about. However, there are some possible solutions and preventive measures for these problems. The best way to face Google hacking is through basic risk management, which consists of prevention, protection and response.

Prevention of Google hacking relates to the anticipation of needs, management wishes, hazards and risks [4]. One strategy for preventing Google hacking is to reduce the risk in the existing website so that a disaster has a lower probability of occurring. However, such methods can be implemented only if the services are provided on a small scale, without the need for mobility and intensive data sharing. Examples include keeping sensitive data off the web, considering removal of the site from Google's index [7], using an automatic scanner, mitigating data exposure and running regularly scheduled assessments.

Protection from Google hacking relates to the process of keeping something, such as confidential information or sensitive data, safe from being hacked. For instance, a security token can be associated with a resource such as a file. Usually, the approach an organization takes to protection is to balance availability, integrity, confidentiality and performance, for example by installing a firewall, using robots.txt [6] or deploying the Google Hack Honeypot [5].

The response of a company that has been hacked by a Google hacker is very important in order to avoid a recurrence of the same kind of incident. If the company does not respond to the incident, more sensitive or confidential data might be stolen by the hacker.
Furthermore, since the replacement cost of, say, stolen research data is high, a company will not want to bear that cost again after one incident. Responses include reporting the incident, educating employees and establishing an incident response policy.

Thus, we as users should be aware that even hidden login pages can be indexed, and that when a search engine "indexes" a site it unwittingly provides treasure troves of information for potential attackers. We should improve our information security in order to keep our confidential or sensitive data from being hacked, and finally, we should also prepare ourselves with the best methods of protection and prevention against Google Hacking.

REFERENCES

[1] Long, J. (2004), Google Hacking Mini-Guide, retrieved on 1 Feb 2010 from informit.com.
[2] Billig, J., Danilchenko, Y. and Frank, C. (2008), Evaluation of Google Hacking, Proceedings of InfoSecCD Conference '08, pp. 27-32, September 2008, Kennesaw, GA, USA.
[3] Lancor, L. and Workman, R. (2007), Using Google Hacking to Enhance Defense Strategies, Proceedings of SIGCSE '07, pp. 491-495.
[4] Hewlett-Packard (HP), Preventing Google hacking: best protect your web application, retrieved on 1 Feb 2010.
[5] SourceForge.net, What is GHH?, retrieved on 1 Feb 2010 from http://ghh.sourceforge.net
[6] The Web Robots Pages, retrieved on 1 Feb 2010.
[7] Webmaster Tools, Removing my own content from Google, retrieved on 1 Feb 2010 from Google.
[8] Comer, D., The Internet Book: Everything You Need to Know About Computer Networking and How the Internet Works, Prentice Hall.
[9] Erickson, J. (2003), Hacking: The Art of Exploitation, No Starch Press.
[10] Jordan, T. (2008), Hacking: Digital Media and Technological Determinism, Polity Press.
[11] Knittel, J. and Soto, M., Everything You Need to Know About the Dangers of Computer Hacking, The Rosen Publishing Group.
[12] Hacking Google for Fun and Profit (2005), retrieved February 2010 from The Channel Register website.
[13] Long, J. and Skoudis, E. (2005), Google Hacking for Penetration Testers, Syngress.
[14] Shinder, D. L. and Cross, M. (2008), Scene of the Cybercrime, 2nd ed., Syngress.
[15] Wang, J. (2009), Computer Network Security, Springer.
[16] Wikipedia, Google Hacking, retrieved February 8, 2010.
[17] Verena XP (2007), Google Hacking, retrieved February 2010 from the College of Engineering, San Jose State University website.
[18] McGraw, G., Chess, B. and Migues, S. (2010), Software [In]security: What Works in Software Security, retrieved February 28, 2010 from informit.com.
[19] GHDB, Googledorks, Google Hacking Database, retrieved February 2010.
[20] Abdelhalim, A. and Traore, I. (2007), The Impact of Google Hacking on Identity and Application Fraud, PACRIM '07.
[21] Tatli, E. I., Google Hacking against Privacy.
[22] Vijayan, J. (2008), Hacker group releases automated Google hacking tool, retrieved February 2010 from computerworld.com.
[23] 5 Ways Google Is Shaking the Security World, retrieved February 2010 from csoonline.com.
