python - Scrapy shell returning blank array with Steam website?


I've used Scrapy before with success on Craigslist. Now I'm trying to scrape Steam user names arbitrarily, but I keep getting a blank array in the Scrapy shell.

The user name element (which is xempy in this example) is contained in:

<a class="searchpersonaname" href="https://steamcommunity.com/id/zxzempy">xempy</a> 

The command I'm using to scrape the actual user names from the URL above is:

response.select('//*[@id="search_results"]/div[3]/div[3]/a/text()').extract() 

The URL I'm attempting to scrape is:

https://steamcommunity.com/search/users/#filter=users&text=xempy  

I used Chrome to copy the XPath of the element I'm interested in instead of typing it by hand, to make sure it is free of typos. I also tried typing it out by hand and using absolute paths, and I still get a blank array, even when I'm attempting to match a simple string like the user name "xempy".

What am I doing wrong? I've used the same process to scrape Craigslist, but on Steam's website it doesn't seem to be working, and I can't find any actual examples of Steam Scrapy scripts.

If you look at the actual source in your browser (right click and choose view source), you will see no sign of the results. The data is dynamically added through an AJAX request to https://steamcommunity.com/search/searchcommunityajax.
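One detail that may help explain the blank array: everything after the # in the search URL is a fragment, and fragments are never sent to the server. The sketch below uses only the standard library to show which part of the URL is actually requested; presumably the page's JavaScript reads the fragment and then issues the AJAX call.

```python
from urllib.parse import parse_qs, urldefrag

page_url = "https://steamcommunity.com/search/users/#filter=users&text=xempy"

# Split the URL into the part that is sent to the server and the
# fragment, which stays in the browser.
requested_url, fragment = urldefrag(page_url)

# The search terms live entirely in the fragment.
search_args = parse_qs(fragment)

print(requested_url)
print(search_args)
```

So a plain GET of that URL, whether from the Scrapy shell or from requests, only ever fetches the bare search page.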

You have to mimic the AJAX request. I have used requests, but the steps are the same with Scrapy:

import requests

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.82 Safari/537.36",
    "X-Requested-With": "XMLHttpRequest"}
params = {"text": "xempy", "filter": "users", "sessionid": "", "steamid_user": "false", "page": "1"}
ajax_url = "https://steamcommunity.com/search/searchcommunityajax"

with requests.Session() as s:
    s.headers.update(headers)
    r = s.get("https://steamcommunity.com/search/users/#filter=users&text=xempy")
    # We need to take the session id from the previous GET's response headers
    params["sessionid"] = next(
        c.split("=", 1)[1] for c in r.headers["Set-Cookie"].split(";") if c.startswith("sessionid"))
    # We need to update the session headers
    s.headers.update(r.headers)
    # And the cookies from the previous request
    s.cookies.update(r.cookies)
    result = s.get(ajax_url, params=params).json()
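The session id lookup above depends on how the Set-Cookie header happens to be laid out. As a hedged alternative, here is a minimal sketch that parses a single Set-Cookie header with the standard library. The helper name cookie_value and the sample header value are made up for illustration and are not part of the original answer.

```python
from http.cookies import SimpleCookie


def cookie_value(set_cookie_header, name):
    """Return the value of cookie `name` from one Set-Cookie header, or None."""
    jar = SimpleCookie()
    jar.load(set_cookie_header)
    morsel = jar.get(name)
    return morsel.value if morsel is not None else None


# Hypothetical header; a real response carries its own session id.
sample_header = "sessionid=abc123; Path=/"
print(cookie_value(sample_header, "sessionid"))
```

With requests, r.cookies.get("sessionid") may be simpler still, assuming the server sets a cookie by that name.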

If you run the code, you can see the JSON returned:

In [5]: with requests.Session() as s:
   ...:     s.headers.update(headers)
   ...:     r = s.get("https://steamcommunity.com/search/users/#filter=users&text=xempy")
   ...:     params["sessionid"] = next(
   ...:         c.split("=", 1)[1] for c in r.headers["Set-Cookie"].split(";") if c.startswith("sessionid"))
   ...:     s.headers.update(r.headers)
   ...:     s.cookies.update(r.cookies)
   ...:     result = s.get(ajax_url, params=params).json()
   ...:     print(result)
   ...:
{u'html': u'\t\t<div style="float: right; padding-bottom: 2px">\r\n\t\t\t\t\t\tshowing 1 - 11 of 11\t\t\t</div>\r\n\t<div style="clear: both"></div>\r\n\t\t\t\t\t\t<div class="search_row">\r\n\t<div class="search_result_friend">\r\n\t\t\t</div>\r\n\t<div class="mediumholder_default" data-miniprofile="16183171" style="float:left;"><div class="avatarmedium"><a href="https://steamcommunity.com/id/zxzempy"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/b9/b9c886a08cf17c4f1f31ea19148d8b3bbd748762_medium.jpg"></a></div></div>\r\n\t<div class="searchpersonainfo">\r\n\t\t<a class="searchpersonaname" href="https://steamcommunity.com/id/zxzempy">xempy</a><br />\r\n\t\t\t\t\t\t\t\t&nbsp;\t\t\t</div>\r\n\t<div style="clear:left"></div>\r\n\r\n\t\t\t<div class="search_match_info">\r\n\t\t\t\t\t\t\t\t\t\t<div>custom url: steamcommunity.com/id/<span style="color: whitesmoke">zxzempy</span></div>\r\n\t\t\t\t\t\t\t\t\t\t<div>\r\n\t\t\t\t\talso known as: <span style="color: whitesmoke">trill</span>, <span style="color: whitesmoke">[tgif] mario batali</span>, <span style="color: whitesmoke">[tgif] mario \xdfatali</span>, <span style="color: whitesmoke">mario \xdfatali</span>, <span style="color: whitesmoke">[tgif\'</span>, <span style="color: whitesmoke">[tgif] mario \u03b2atali</span>\t\t\t\t</div>\r\n\t\t\t\t\t</div>\r\n\t\t</div>\r\n\t\t\t\t\t\t\t\t<div class="search_row">\r\n\t<div class="search_result_friend">\r\n\t\t\t</div>\r\n\t<div 
class="mediumholder_default" data-miniprofile="280326130" style="float:left;"><div class="avatarmedium"><a href="https://steamcommunity.com/id/xempyjecar"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/89/8928b324ba9c12859283e8be3f11f19d9232033c_medium.jpg"></a></div></div>\r\n\t<div class="searchpersonainfo">\r\n\t\t<a class="searchpersonaname" href="https://steamcommunity.com/id/xempyjecar">xempy -a-</a><br />\r\n\t\t\t\t\tigor<br />\t\t\tserbia&nbsp;<img style="margin-bottom:-2px" src="https://steamcommunity-a.akamaihd.net/public/images/countryflags/rs.gif" border="0" />\t\t\t</div>\r\n\t<div style="clear:left"></div>\r\n\r\n\t\t\t<div class="search_match_info">\r\n\t\t\t\t\t\t\t\t\t\t<div>custom url: steamcommunity.com/id/<span style="color: whitesmoke">xempyjecar</span></div>\r\n\t\t\t\t\t\t\t\t\t\t<div>\r\n\t\t\t\t\talso known as: <span style="color: whitesmoke">xempy -a- new season hypee</span>, <span style="color: whitesmoke">brekija</span>, <span style="color: whitesmoke">fairplay organisation</span>, <span style="color: whitesmoke">xempy | csgoshit.com</span>, <span style="color: whitesmoke">xempy | csgorage.com</span>, <span style="color: whitesmoke">\u2500\u2500\u2500\u2554\u2550\u2550\u2550\u2557</span>, <span style="color: whitesmoke">xempythecupcake</span>\t\t\t\t</div>\r\n\t\t\t\t\t</div>\r\n\t\t</div>\r\n\t\t\t\t\t\t\t\t<div class="search_row">\r\n\t<div class="search_result_friend">\r\n\t\t\t</div>\r\n\t<div class="mediumholder_default" data-miniprofile="315139919" style="float:left;"><div class="avatarmedium"><a href="https://steamcommunity.com/id/filipppp"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/ca/caa5747851b5255a2d76699d855bf20e709af3d1_medium.jpg"></a></div></div>\r\n\t<div class="searchpersonainfo">\r\n\t\t<a class="searchpersonaname" href="https://steamcommunity.com/id/filipppp">xempy -a-</a><br />\r\n\t\t\t\t\tigor<br />\t\t\tserbia&nbsp;<img style="margin-bottom:-2px" 
src="https://steamcommunity-a.akamaihd.net/public/images/countryflags/rs.gif" border="0" />\t\t\t</div>\r\n\t<div style="clear:left"></div>\r\n\r\n\t\t\t<div class="search_match_info">\r\n\t\t\t\t\t\t\t\t\t\t<div>custom url: steamcommunity.com/id/<span style="color: whitesmoke">filipppp</span></div>\r\n\t\t\t\t\t\t\t\t\t\t<div>\r\n\t\t\t\t\talso known as: <span style="color: whitesmoke">extreeemeeee</span>, <span style="color: whitesmoke">ratatatatatata</span>\t\t\t\t</div>\r\n\t\t\t\t\t</div>\r\n\t\t</div>\r\n\t\t\t\t\t\t\t\t<div class="search_row">\r\n\t<div class="search_result_friend">\r\n\t\t\t</div>\r\n\t<div class="mediumholder_default" data-miniprofile="258386073" style="float:left;"><div class="avatarmedium"><a href="https://steamcommunity.com/id/lenyagoglov"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/71/71ee8d0519c74cea0352836b188c747b36224f8f_medium.jpg"></a></div></div>\r\n\t<div class="searchpersonainfo">\r\n\t\t<a class="searchpersonaname" href="https://steamcommunity.com/id/lenyagoglov">xempys</a><br />\r\n\t\t\t\t\tted<br />\t\t\tluxembourg&nbsp;<img style="margin-bottom:-2px" src="https://steamcommunity-a.akamaihd.net/public/images/countryflags/lu.gif" border="0" />\t\t\t</div>\r\n\t<div style="clear:left"></div>\r\n\r\n\t\t\t<div class="search_match_info">\r\n\t\t\t\t\t\t\t\t\t\t<div>custom url: steamcommunity.com/id/<span style="color: whitesmoke">lenyagoglov</span></div>\r\n\t\t\t\t\t\t\t\t</div>\r\n\t\t</div>\r\n\t\t\t\t\t\t\t\t<div class="search_row">\r\n\t<div class="search_result_friend">\r\n\t\t\t</div>\r\n\t<div class="mediumholder_default" data-miniprofile="257927191" style="float:left;"><div class="avatarmedium"><a href="https://steamcommunity.com/id/rostislavtseychuk85"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/86/8641de85a283f0d23d1cbeb35ee0c0d5ca87a83b_medium.jpg"></a></div></div>\r\n\t<div class="searchpersonainfo">\r\n\t\t<a class="searchpersonaname" 
href="https://steamcommunity.com/id/rostislavtseychuk85">xempys</a><br />\r\n\t\t\t\t\tgabriel<br />\t\t\tlebanon&nbsp;<img style="margin-bottom:-2px" src="https://steamcommunity-a.akamaihd.net/public/images/countryflags/lb.gif" border="0" />\t\t\t</div>\r\n\t<div style="clear:left"></div>\r\n\r\n\t\t\t<div class="search_match_info">\r\n\t\t\t\t\t\t\t\t\t\t<div>custom url: steamcommunity.com/id/<span style="color: whitesmoke">rostislavtseychuk85</span></div>\r\n\t\t\t\t\t\t\t\t</div>\r\n\t\t</div>\r\n\t\t\t\t\t\t\t\t<div class="search_row">\r\n\t<div class="search_result_friend">\r\n\t\t\t</div>\r\n\t<div class="mediumholder_default" data-miniprofile="252811169" style="float:left;"><div class="avatarmedium"><a href="https://steamcommunity.com/id/mochulskayaa"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/76/76c10b0744403468aaf8090f56ca8ddd61338925_medium.jpg"></a></div></div>\r\n\t<div class="searchpersonainfo">\r\n\t\t<a class="searchpersonaname" href="https://steamcommunity.com/id/mochulskayaa">xempys</a><br />\r\n\t\t\t\t\trichard<br />\t\t\tguatemala&nbsp;<img style="margin-bottom:-2px" src="https://steamcommunity-a.akamaihd.net/public/images/countryflags/gt.gif" border="0" />\t\t\t</div>\r\n\t<div style="clear:left"></div>\r\n\r\n\t\t\t<div class="search_match_info">\r\n\t\t\t\t\t\t\t\t\t\t<div>custom url: steamcommunity.com/id/<span style="color: whitesmoke">mochulskayaa</span></div>\r\n\t\t\t\t\t\t\t\t</div>\r\n\t\t</div>\r\n\t\t\t\t\t\t\t\t<div class="search_row">\r\n\t<div class="search_result_friend">\r\n\t\t\t</div>\r\n\t<div class="mediumholder_default" data-miniprofile="260028611" style="float:left;"><div class="avatarmedium"><a href="https://steamcommunity.com/id/katerukhina"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/24/24241e97a6caf3bd932a01ea22afc6b3d758f1a1_medium.jpg"></a></div></div>\r\n\t<div class="searchpersonainfo">\r\n\t\t<a class="searchpersonaname" 
href="https://steamcommunity.com/id/katerukhina">xempys</a><br />\r\n\t\t\t\t\tchristian<br />\t\t\tfiji&nbsp;<img style="margin-bottom:-2px" src="https://steamcommunity-a.akamaihd.net/public/images/countryflags/fj.gif" border="0" />\t\t\t</div>\r\n\t<div style="clear:left"></div>\r\n\r\n\t\t\t<div class="search_match_info">\r\n\t\t\t\t\t\t\t\t\t\t<div>custom url: steamcommunity.com/id/<span style="color: whitesmoke">katerukhina</span></div>\r\n\t\t\t\t\t\t\t\t</div>\r\n\t\t</div>\r\n\t\t\t\t\t\t\t\t<div class="search_row">\r\n\t<div class="search_result_friend">\r\n\t\t\t</div>\r\n\t<div class="mediumholder_default" data-miniprofile="292454844" style="float:left;"><div class="avatarmedium"><a href="https://steamcommunity.com/id/purdenkos"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/5c/5c7f9d1b71a68ab8599ae0fe8f2c4e0445348eaa_medium.jpg"></a></div></div>\r\n\t<div class="searchpersonainfo">\r\n\t\t<a class="searchpersonaname" href="https://steamcommunity.com/id/purdenkos">xempys</a><br />\r\n\t\t\t\t\tpatrik<br />\t\t\tcote d\'ivoire (ivory coast)&nbsp;<img style="margin-bottom:-2px" src="https://steamcommunity-a.akamaihd.net/public/images/countryflags/ci.gif" border="0" />\t\t\t</div>\r\n\t<div style="clear:left"></div>\r\n\r\n\t\t\t<div class="search_match_info">\r\n\t\t\t\t\t\t\t\t\t\t<div>custom url: steamcommunity.com/id/<span style="color: whitesmoke">purdenkos</span></div>\r\n\t\t\t\t\t\t\t\t</div>\r\n\t\t</div>\r\n\t\t\t\t\t\t\t\t<div class="search_row">\r\n\t<div class="search_result_friend">\r\n\t\t\t</div>\r\n\t<div class="mediumholder_default" data-miniprofile="56000172" style="float:left;"><div class="avatarmedium"><a href="https://steamcommunity.com/id/v2incent"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/ac/ac45a256e0a14712efff255db0105fedd80a4f0e_medium.jpg"></a></div></div>\r\n\t<div class="searchpersonainfo">\r\n\t\t<a class="searchpersonaname" 
href="https://steamcommunity.com/id/v2incent">ext4ze ` ^0| \'xempy^0\'</a><br />\r\n\t\t\t\t\tv2incent<br />\t\t\t&nbsp;\t\t\t</div>\r\n\t<div style="clear:left"></div>\r\n\r\n\t\t\t<div class="search_match_info">\r\n\t\t\t\t\t\t\t\t\t\t<div>custom url: steamcommunity.com/id/<span style="color: whitesmoke">v2incent</span></div>\r\n\t\t\t\t\t\t\t\t</div>\r\n\t\t</div>\r\n\t\t\t\t\t\t\t\t<div class="search_row">\r\n\t<div class="search_result_friend">\r\n\t\t\t</div>\r\n\t<div class="mediumholder_default" data-miniprofile="297670812" style="float:left;"><div class="avatarmedium"><a href="https://steamcommunity.com/id/xempy"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/62/62ea583f7f838562c73cb70e3993e27acd583aef_medium.jpg"></a></div></div>\r\n\t<div class="searchpersonainfo">\r\n\t\t<a class="searchpersonaname" href="https://steamcommunity.com/id/xempy">xempsanity `\xb4</a><br />\r\n\t\t\t\t\tigor<br />\t\t\tserbia&nbsp;<img style="margin-bottom:-2px" src="https://steamcommunity-a.akamaihd.net/public/images/countryflags/rs.gif" border="0" />\t\t\t</div>\r\n\t<div style="clear:left"></div>\r\n\r\n\t\t\t<div class="search_match_info">\r\n\t\t\t\t\t\t\t\t\t\t<div>custom url: steamcommunity.com/id/<span style="color: whitesmoke">xempy</span></div>\r\n\t\t\t\t\t\t\t\t\t\t<div>\r\n\t\t\t\t\talso known as: <span style="color: whitesmoke">xempykingofnothing</span>, <span style="color: whitesmoke">x3mpy</span>, <span style="color: whitesmoke">x3mpy * brother\'s on acc</span>\t\t\t\t</div>\r\n\t\t\t\t\t</div>\r\n\t\t</div>\r\n\t\t\t\t\t\t\t\t<div class="search_row">\r\n\t<div class="search_result_friend">\r\n\t\t\t</div>\r\n\t<div class="mediumholder_default" data-miniprofile="121633219" style="float:left;"><div class="avatarmedium"><a href="https://steamcommunity.com/id/empyrk"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/6b/6b87d7a04bf211a2665b828436ad34e549f2b193_medium.jpg"></a></div></div>\r\n\t<div 
class="searchpersonainfo">\r\n\t\t<a class="searchpersonaname" href="https://steamcommunity.com/id/empyrk">empyrk</a><br />\r\n\t\t\t\t\tmatteo<br />\t\t\ttoscana, italy&nbsp;<img style="margin-bottom:-2px" src="https://steamcommunity-a.akamaihd.net/public/images/countryflags/it.gif" border="0" />\t\t\t</div>\r\n\t<div style="clear:left"></div>\r\n\r\n\t\t\t<div class="search_match_info">\r\n\t\t\t\t\t\t\t\t\t\t<div>custom url: steamcommunity.com/id/<span style="color: whitesmoke">empyrk</span></div>\r\n\t\t\t\t\t\t\t\t</div>\r\n\t\t</div>\r\n\t\t\t\t<div style="clear: both"></div>\r\n\t\t<div style="float: right; padding-bottom: 2px">\r\n\t\t\t\t\t\tshowing 1 - 11 of 11\t\t\t</div>\r\n\t<div style="clear: both"></div>\r\n\r\n\r\n', u'search_filter': u'users', u'search_text': u'xempy', u'success': 1, u'search_page': 1} 

You then need to access result["html"] to get the HTML source.
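To pull the user names out of that HTML, one option is a small parser that collects the text of every a element with class searchpersonaname. This is only a sketch using the standard library; the class PersonaNameParser and the shortened sample_html are illustrative, modelled on the output shown above.

```python
from html.parser import HTMLParser


class PersonaNameParser(HTMLParser):
    """Collect the text of every <a class="searchpersonaname"> element."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        if tag == "a" and dict(attrs).get("class") == "searchpersonaname":
            self._inside = True
            self.names.append("")

    def handle_data(self, data):
        # Only keep text that sits inside a matching <a> element.
        if self._inside:
            self.names[-1] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._inside = False


# Shortened sample in the shape of the "html" value from the JSON response.
sample_html = (
    '<div class="searchpersonainfo">'
    '<a class="searchpersonaname" href="https://steamcommunity.com/id/zxzempy">xempy</a><br />'
    '</div>'
    '<div class="searchpersonainfo">'
    '<a class="searchpersonaname" href="https://steamcommunity.com/id/empyrk">empyrk</a><br />'
    '</div>'
)

parser = PersonaNameParser()
parser.feed(sample_html)
parser.close()
print(parser.names)
```

Inside Scrapy, something like Selector(text=result["html"]).css("a.searchpersonaname::text").getall() should give the same list, without relying on the positional div[3]/div[3] path from the question.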

