The Deep Web Essay
Did you know that when you Google “Red Wolves,” Google searches only the pages it has indexed, which make up only about four percent of the total World Wide Web (Why Businesses Needs to Understand the Deep Web, 2013)? The other ninety-six percent is known as the Deep Web, or Invisible Web. This little-known realm is notorious for housing terrorist communication, gun and drug trading, assassination bids, and even child pornography. But the Deep Web also holds private files shared and stored for business, academic, and personal use. Like typical websites, these pages are still encrypted and decrypted, but typical search engines do not have the website's private key needed to decrypt them, and therefore the pages cannot be indexed.
The Darknet is the specific part of the Deep Web that houses the illegal activity, mainly because it runs on private networks linked peer-to-peer. Anyone exploring this part of the web is advised to use a browser that keeps his or her location anonymous. The Onion Router, or Tor, is an example of such a browser (Tor: Overview, n.d.). A computer's IP address shows where a user is connecting from, for example home. Tor hides it by bouncing the connection through volunteer nodes, burying the IP address under many layers, like an onion. Why? Because this is a criminal underworld full of hackers and other cyber criminals.
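The layering idea can be illustrated with a toy sketch. This is not Tor's actual protocol or cryptography: the relay names, the keys, the destination name, and the XOR-based `toy_cipher` are all assumptions invented for illustration, and the cipher is not secure. The sketch only shows a sender wrapping a message once per relay, and each relay peeling off one layer to learn nothing more than the next hop.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Stretch a key into `length` bytes (toy construction, NOT secure)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; applying it twice with the same key undoes it."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

# Hypothetical three-relay circuit; in this toy the sender shares one key with each relay.
relay_keys = {"entry": b"key-entry", "middle": b"key-middle", "exit": b"key-exit"}
path = ["entry", "middle", "exit"]

def wrap(message: bytes, destination: str) -> bytes:
    """Sender builds the onion from the inside out, one layer per relay."""
    packet = message
    next_hop = destination
    for name in reversed(path):
        # Each layer holds the next hop's name plus the still-wrapped inner packet.
        packet = toy_cipher(relay_keys[name], next_hop.encode() + b"|" + packet)
        next_hop = name
    return packet

def peel(name: str, packet: bytes):
    """A relay removes only its own layer and learns only where to forward."""
    plain = toy_cipher(relay_keys[name], packet)
    next_hop, _, inner = plain.partition(b"|")
    return next_hop.decode(), inner

message = b"GET /index.html"
onion = wrap(message, "destination-site")

hops = []
hop, packet = path[0], onion
while hop in relay_keys:          # stop once the packet leaves the relay circuit
    hop, packet = peel(hop, packet)
    hops.append(hop)
delivered = packet

print(hops)
print(delivered == message)
```

In this toy, the entry relay sees the sender but only learns the name of the middle relay, while the exit relay sees the destination but not the sender. That separation is the point of the layers.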
The Deep Web, as defined in Learning to Crawl Deep Web (Zheng et al., 2011), is the portion of the World Wide Web that is not part of the surface web; in other words, web pages that are not indexed by search engines. Search engines usually use a “crawler” that indexes a web page, follows its links, and repeats. This continues until the engine has a large catalog of the internet. A vast majority of pages, though, cannot be reached by these crawlers. There can be several reasons for this: no other page may link to the page, it may sit behind a log-in page that crawlers cannot pass, or its owner may not want it indexed. Database-driven websites are often not indexed, which is a problem when a user would like to look up scholarly articles.
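The crawl loop and the three barriers described above can be sketched over a made-up miniature web. The page names and the `login_required` and `noindex` flags are hypothetical, and a real crawler would fetch pages over HTTP rather than read a dictionary. This is a minimal sketch of the idea, not how any particular search engine works.

```python
from collections import deque

# Hypothetical miniature web: each page lists its outgoing links and any barrier.
WEB = {
    "home":          {"links": ["news", "login", "private-blog"]},
    "news":          {"links": ["home", "archive"]},
    "archive":       {"links": []},
    "login":         {"links": ["bank-account"], "login_required": True},
    "bank-account":  {"links": []},
    "private-blog":  {"links": [], "noindex": True},   # owner opted out of indexing
    "orphan-report": {"links": []},                    # no page links here
}

def crawl(seed: str) -> list:
    """Index a page, follow its links, and repeat until nothing new is found."""
    indexed = []
    seen = {seed}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        info = WEB[page]
        if info.get("login_required"):
            continue                      # the crawler cannot pass the log-in page
        if not info.get("noindex"):
            indexed.append(page)
        for link in info["links"]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return indexed

surface = crawl("home")
deep = sorted(set(WEB) - set(surface))
print("surface:", surface)
print("deep:", deep)
```

Everything the crawl never indexes ends up in `deep`: the page behind the log-in, the page that asked not to be indexed, and the page nothing links to.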
There are, after all, different results when you search Google than when you search ASU’s scholarly article databases. It is not just academic databases, but also databases that hold your bank account, credit cards, and other private or sensitive information you would not want someone to be able to find just by searching Bing. Cloud computing uses the internet to store and share files for consumers and businesses. These files are also excluded from indexing and are therefore part of the Deep Web. Emails also travel through the World Wide Web, and because crawlers do not have access to the private keys needed to decrypt them, emails cannot be indexed, meaning they travel through the Deep Web. You may have been on the Deep Web without even knowing it.
Academic database websites are very useful when conducting research. The problem is that a majority of them are not indexed. These databases are known as the Academic Invisible Web, or AIW (Lewandowski & Mayr, 2007). When users do research, they expect a search on Google or Bing to be complete and up to date. Information professionals, who gather information from multiple sources and combined search environments, have to use over-simplified general search engines whose indexes are insufficient (Lewandowski & Mayr, 2007). Google Scholar is an example of a search engine that can search a fair number of these academic databases in the Invisible Web, but with questionable quality.
This could be explained in several ways, one of which is the sheer size of the Deep Web. It is hard to know exactly how big it is, and even the best estimates are off. M. K. Bergman gives a surface-to-Deep-Web ratio of 1:550 (Lewandowski & Mayr, 2007). He bases it on the sixty largest academic databases. The issue with this ratio is that these sixty websites contain eighty-five billion documents totaling 748,504 gigabytes, and the first two databases on the list contain 585,400 gigabytes. That is more than seventy-five percent of the total amount of data, which skews the results. Bergman investigated further and calculated a mean size of 5.43 million documents per database in the Academic Invisible Web. This causes the same problem as before because he uses the same sixty databases. The median number of documents per academic database is 4,950.
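The skew can be checked with a little arithmetic. The two gigabyte figures below come from the paragraph above; the list of document counts is invented purely to show how a single giant database pulls the mean far away from the median, and it is not Bergman's data.

```python
from statistics import mean, median

# Figures quoted in the text.
total_gb = 748_504       # all sixty databases
top_two_gb = 585_400     # the two largest databases

share = top_two_gb / total_gb
print(f"Share held by the two largest databases: {share:.1%}")

# Hypothetical document counts (NOT Bergman's data): four small databases and one giant.
doc_counts = [2_000, 3_500, 4_950, 8_000, 40_000_000]

print("mean:  ", mean(doc_counts))
print("median:", median(doc_counts))
```

The share comes out a little above three quarters. In the invented sample, the mean lands in the millions while the median stays in the thousands, the same pattern as the gap between 5.43 million and 4,950.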
Yet again, the distribution of sizes skews the results. This makes it difficult to determine the size of the Academic Invisible Web from Bergman’s estimates alone. We can, however, compare his numbers to the Gale Directory of Databases (Williams, 2005). Gale’s directory does not include some of the larger databases that Bergman used, but it tends to have a more even distribution of sizes. Its numbers yield significantly smaller estimates, and comparing them with Bergman’s gives a rough range of twenty to one hundred billion documents in the Academic Invisible Web alone. It is important that these databases be indexed so that researchers can access this information quickly and easily. In the business world, a company with a search engine that indexes these websites would enjoy quicker research and stay a step ahead of the competition.
There is a deeper, darker corner of the World Wide Web called the Darknet. Here anonymity is a must, achieved with programs like The Onion Router, or Tor. If someone knows the source and destination of your traffic, they could be a threat to you electronically and physically. Once on the Tor browser, you can explore the Darknet and its dark marketplace. This market uses an electronic currency called Bitcoin. Bitcoin is a peer-to-peer payment system introduced as open-source software in 2009 (Wikipedia, 2009). Bitcoins are not controlled by a central power like a bank; rather, they are created as a reward for users who offer their time and processing power to verify and record transactions in a public ledger.
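The idea of earning a reward for spending processing power to record transactions in a public ledger can be sketched with a toy hash chain. This is a simplified illustration, not Bitcoin's real block format or rules: the `DIFFICULTY` and `BLOCK_REWARD` values, the field names, and the participants are assumptions chosen so that the example runs quickly.

```python
import hashlib
import json

DIFFICULTY = 3      # assumed: hash must start with this many zero hex digits
BLOCK_REWARD = 50   # assumed reward credited to whoever records the block

def block_hash(block: dict) -> str:
    """Hash the block's full contents, including the link to the previous block."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def mine(prev_hash: str, transactions: list, miner: str) -> dict:
    """Spend processing power: try nonces until the hash meets the difficulty."""
    block = {"prev": prev_hash, "tx": transactions,
             "reward_to": miner, "reward": BLOCK_REWARD, "nonce": 0}
    while not block_hash(block).startswith("0" * DIFFICULTY):
        block["nonce"] += 1
    return block

def verify(ledger: list) -> bool:
    """Anyone can recheck the public ledger: links and proofs must all hold."""
    prev = "0" * 64
    for block in ledger:
        if block["prev"] != prev:
            return False
        if not block_hash(block).startswith("0" * DIFFICULTY):
            return False
        prev = block_hash(block)
    return True

ledger = []
prev = "0" * 64
for txs, miner in [(["alice->bob: 5"], "carol"), (["bob->dave: 2"], "erin")]:
    block = mine(prev, txs, miner)
    ledger.append(block)
    prev = block_hash(block)

valid_before = verify(ledger)
ledger[0]["tx"] = ["alice->bob: 500"]    # tamper with a recorded transaction
valid_after = verify(ledger)
print(valid_before, valid_after)
```

Because each block stores the hash of the one before it, altering an old transaction breaks the chain, and no bank or other central party is needed to notice.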
At the time of writing, there are 12 million Bitcoins in circulation, and the number of Bitcoins to be produced is capped at 21 million. Bitcoin's worth has fluctuated vastly since its introduction. In 2011, one Bitcoin was worth roughly seventy-five cents; in December of 2013 the value was roughly 1,200 dollars per Bitcoin (Valdes, What is Bitcoin? How does it work?, 2014). This fluctuation shows instability in Bitcoin, though its value has recently begun to stabilize. This depth is also where cyber-thieves use malware such as Trojan horses, spyware, viruses, and worms. Cyber security companies, working with crime-fighting agencies, can monitor these activities and act accordingly, helping businesses keep their sensitive files safe (Ellyatt, 2013).
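The 21 million cap mentioned above follows from how the block reward shrinks over time. The schedule used below is not described in this essay; it is taken from Bitcoin's published rules (a 50-bitcoin reward that halves every 210,000 blocks, counted in whole satoshi) and should be read as an assumption of this sketch.

```python
SATOSHI_PER_BTC = 100_000_000     # smallest unit: one hundred-millionth of a bitcoin
HALVING_INTERVAL = 210_000        # assumed: blocks between reward halvings
subsidy = 50 * SATOSHI_PER_BTC    # assumed: initial reward per block, in satoshi

total = 0
eras = 0
while subsidy > 0:
    total += subsidy * HALVING_INTERVAL
    subsidy //= 2                 # integer halving, so the reward eventually reaches zero
    eras += 1

total_btc = total / SATOSHI_PER_BTC
print(f"reward eras: {eras}")
print(f"total supply: {total_btc:,.4f} bitcoins")
print(f"12 million in circulation is {12_000_000 / total_btc:.0%} of the cap")
```

The sum of the ever-halving rewards approaches but never reaches 21 million, which is why the supply is said to be capped.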
There are basically three layers of the World Wide Web: the surface web, the Deep Web and, within the Deep Web, the Darknet. Everything you find on Google, Bing, or any other typical search engine is the surface web. Everything else is considered the Deep Web. This is where everything from databases to private albums swims about. The Darknet is the part that relies on peer-to-peer connections to stay anonymous. Programs like Tor are recommended while exploring this part of the Deep Web to keep your IP address masked. Businesses use the Deep Web to store their databases, while cyber security and law enforcement agencies work together to make sure the Darknet does not take this information.
Ellyatt, H. (2013, Monday 9). How business can shed a light on the ‘dark net’. Retrieved from CNBC News: http://www.cnbc.com/id/101234129
Kharpal, A. (2013, November 7). Copycat Silk Road drug site reopens after FBI raid. Retrieved from CNBC News: http://www.cnbc.com/id/101178729
Lewandowski, D., & Mayr, P. (2007). Exploring the Academic Invisible Web.
Tor: Overview. (n.d.). Retrieved from Tor: https://www.torproject.org/about/overview.html.en
Valdes, A. (2014, March 17). The Deep Web: Everything you need to know in 2 minutes. Retrieved from Mashable: http://mashable.com/2014/03/17/deep-web/
Valdes, A. (2014, February 10). What is Bitcoin? How does it work? Retrieved from YouTube: http://www.youtube.com/watch?v=ZT26y_l-jtI&list=PLSKUhDnoJjYn0TV9V84C4Wr2DjKPc492c&src_vid=f-IOUuIxejA&feature=iv&annotation_id=annotation_484507783
Why Businesses Needs to Understand the Deep Web. (2013, August 18). Retrieved from Mediabadger: http://www.mediabadger.com/2013/08/why-business-needs-to-understand-the-deep-web/
Wikipedia. (2009, January 3). Bitcoin. Retrieved from Wikipedia: http://en.wikipedia.org/wiki/Bitcoin
Williams, M. (2005). The state of databases today: 2005. In Gale Directory of Databases (pp. XV-XXV). Detroit: Gale Group.
Zheng, Q., Wu, Z., Cheng, X., Jiang, L., & Liu, J. (2011). Learning to Crawl Deep Web. Elsevier.