Advanced Technical SEO - Index Bloat & Discovery: from Facets to Javascript Frameworks - SMX Munich 2016

Ari Nahmani covers the latest in advanced technical SEO at SMX Munich (Muenchen) 2016. The talk discusses the deprecated HTML snapshot, JavaScript crawlability and indexing, new frameworks, prerendering, server-side rendering, prerender.io, isomorphic JavaScript, and other technical issues related to the future of protecting your index health.

  1. 1. The Latest in Advanced Technical SEO Index Bloat & Discovery: from Facets to Frameworks
  2. 2. Hi! Good Afternoon. Ari Nahmani CEO / Founder Kahena Digital Marketing ari@kahenadigital.com
  3. 3. Team / Clients
  4. 4. index bloat
  5. 5. index bloat
  6. 6. crawl budget
  7. 7. web-tech > googlebot
  8. 8. discoverability
  9. 9. Today’s Session • Technical SEO issues around e-commerce / large site architecture • Preventing index bloat & preserving crawl budget as a core methodology • Current solutions & upcoming threats (JS, AJAX, new frameworks, pre-rendering)
  10. 10. Index Bloat Prevention
  11. 11. Index Bloat Prevention A bloated index = if indexed URLs > “unique pages”
  12. 12. Index Bloat Prevention On an ecommerce site: A bloated index = if indexed URLs > sum(CAT+PDP+Static)
  13. 13. Index Bloat Prevention On a ‘content’ site: A bloated index = if indexed URLs > sum(Articles+Static)
  14. 14. cannibalization
  15. 15. Index Bloat Prevention: Cannibalization
  16. 16. Index Bloat Prevention: Sorts & Facets
  17. 17. Index Bloat Prevention: Sorts & Filters http://www.site.com/guys/tees/?prefn1=bvAverageRating&prefn2=colorGroup&prefv3=LG&srule=sortingNewArrival&prefv1=4&prefv2=RED&prefn3=size
  18. 18. Index Bloat Prevention: Sorts & Filters <link rel="canonical" href="http://www.site.com/guys/tees/" /> • Basic Solution: Strip out the unnecessary parameters
  19. 19. Solution: Filtering Out All Facet Params • PROS: – Avoids diluted / duplicate URLs (the canonical tag is a request, not a directive) • CONS: – If you want or need specific parameters indexed and exposed (size, color), you need properly coded canonical tag logic; otherwise it is a recipe for major leaks and confusion. – Considerations with pagination & the view-all page
  20. 20. Crawl Budget: Facet Parameter URLs
  21. 21. Crawl Budget: Facet Parameter URLs
  22. 22. JS / AJAX Indexation
  23. 23. Index Bloat VS Discovery: JS + AJAX
  24. 24. Index Bloat Prevention: JS + AJAX AJAX Refinement V1 = NO URL CHANGE
  25. 25. Index Bloat Prevention: JS + AJAX AJAX Refinement V1 - NO URL CHANGE, but inactive, different href= URL exists
  26. 26. AJAX Facet Refinements V1 (NO URL CHANGE) • PROS: – Theoretically no parameters exposed to bloat the index • CONS: – Users can’t share refined / filtered content with friends, and can’t bookmark accurately. (Terrible UX) – Googlebot will still crawl hidden href= links, or other JS framework links such as Angular’s ng-href= (check canonical logic!!)
  27. 27. Index Bloat Prevention: JS + AJAX AJAX Refinement V2 = HTML5 history.pushState()
  28. 28. Index Bloat Prevention: JS + AJAX HTML5 history.pushState() http://www.site.com/guys/tees/?color=green&size=large
  29. 29. Consistent URL Signals - Navigation Ideal consistency: Navigation URLs = pushState() URLs = Canonical URLs = XML Sitemap URLs
  30. 30. Consistent URL Signals - Navigation Ideal consistency: Navigation URLs = pushState() URLs ≠ Canonical URLs = XML Sitemap URLs
  31. 31. Index Bloat Prevention: JS + AJAX Google preferred the pushState() URL version, so we had to reinforce consistent URL signals (via normal inline href="", canonical, XML sitemap)
  32. 32. AJAX Facet Refinements V2 (PushState URL Change) • PROS: – Users can now share / bookmark the correct content – Refinements are added to browser history • CONS: – Still need a consistent canonical structure, because Googlebot crawls pushState() URLs – A different hidden URL structure via AJAX facets may require further, unpredictable canonicalization logic / further dev work
  33. 33. Indexing AJAX & JS Frameworks
  34. 34. Indexing AJAX & JS Frameworks
  35. 35. Indexing AJAX & JS Frameworks What method exists that we know still works?
  36. 36. Indexing AJAX & JS Frameworks HTML SNAPSHOT
  37. 37. Indexing AJAX & JS: HTML Snapshot <head> <meta name="fragment" content="!"> Google / Bing crawl with: _escaped_fragment_=
  38. 38. Indexing AJAX & JS: HTML Snapshot
  39. 39. Indexing AJAX & JS: HTML Snapshot
  40. 40. Indexing AJAX & JS: How To Decide? [Diagram labels: HTML SNAPSHOT _escaped_fragment_= · ‘Dumbed down’ HTML Template · Pre-Rendered (to bots) · 3rd Party Service (prerender.io) · Server side (phantomJS / headless browser) · Pre or Realtime Rendered (to users & bots) · Trust Googlebot · VALIDATE! · Progressive Enhancement]
  41. 41. Indexing AJAX & JS: How To Decide? [Diagram labels: HTML SNAPSHOT _escaped_fragment_= · ‘Dumbed down’ HTML Template · Pre-Rendered (to bots) · 3rd Party Service (prerender.io) · Server side (phantomJS / headless browser) · Pre or Realtime Rendered (to users & bots) · Trust Googlebot · VALIDATE! · Progressive Enhancement]
  42. 42. Indexing AJAX & JS: HTML Snapshot • Upon crawl of a URL with _escaped_fragment_=, serve a ‘dumbed down’ HTML version of the page. • Not pre-rendered, rather simplified. • For example, on ecommerce → a view-all category listing with no dynamic facets. Amazing results from our clients.
  43. 43. Indexing AJAX & JS: How To Decide? [Diagram labels: HTML SNAPSHOT _escaped_fragment_= · ‘Dumbed down’ HTML Template · Pre-Rendered (to bots) · 3rd Party Service (prerender.io) · Server side (phantomJS / headless browser) · Pre or Realtime Rendered (to users & bots) · Trust Googlebot · VALIDATE! · Progressive Enhancement]
  44. 44. Indexing AJAX & JS: Pre-rendering Upon crawl of a URL with _escaped_fragment_= 1. prerender.io – middleware via reverse proxy that serves a pre-rendered, cached HTML page to bots OR 2. Server side – the server pre-renders the JS into cached HTML pages to serve to bots, or renders it in real time (headless browser).
  45. 45. Indexing AJAX & JS: Prerender.io
  46. 46. Indexing AJAX & JS: Prerender.io
  47. 47. Indexing AJAX & JS: BromBone
  48. 48. Indexing AJAX & JS: Server Prerender
  49. 49. Indexing AJAX & JS: How To Decide? [Diagram labels: HTML SNAPSHOT _escaped_fragment_= · ‘Dumbed down’ HTML Template · Pre-Rendered (to bots) · 3rd Party Service (prerender.io) · Server side (phantomJS / headless browser) · Pre or Realtime Rendered (to users & bots) · Trust Googlebot · VALIDATE! · Progressive Enhancement]
  50. 50. Indexing AJAX & JS: Server Side bit.ly/javascriptseo
  51. 51. Indexing AJAX & JS: Server Side bit.ly/javascriptseo
  52. 52. Indexing AJAX & JS: Server Side bit.ly/javascriptseo
  53. 53. Indexing AJAX & JS: How To Decide? [Diagram labels: HTML SNAPSHOT _escaped_fragment_= · ‘Dumbed down’ HTML Template · Pre-Rendered (to bots) · 3rd Party Service (prerender.io) · Server side (phantomJS / headless browser) · Pre or Realtime Rendered (to users & bots) · Trust Googlebot · VALIDATE! · Progressive Enhancement]
  54. 54. Indexing AJAX & JS: Trust Googlebot read these first…
  55. 55. Testing JS Indexation: Jscrawlability.com
  56. 56. Validation & Testing: Discovery vs Bloat
  57. 57. Testing: Fetch & Render JS / AJAX
  58. 58. Testing: Slice and Dice the Index Advanced Site Operators site:yoursite.com -inurl:cat.jsp -inurl:prod.jsp -inurl:store.jsp
  59. 59. Testing: Slice and Dice the Index Advanced Site Operators site:yoursite.com inurl:size inurl:cat.jsp -inurl:cid
  60. 60. Testing: Slice and Dice the Index Advanced Site Operators site:yoursite.com inurl:pdp intext:"write a review"
  61. 61. Testing: Automate Bloat + Discovery Check
  62. 62. Testing: Automate Bloat + Discovery Check
  63. 63. Testing: Search Analytics for Bloat / Discovery
  64. 64. Testing: Go To The Source: Server Logs!
  65. 65. Summing It Up • Index Bloat, Crawl Budget, & Testing: Large sites are prone to serious index bloat and wasted crawl budget. Both need diligent testing and an OCD-like attention to detail with the basics. Test often & automate! • JS/AJAX: pushState(), JS frameworks and AJAX present both discovery and bloat challenges. Know the options: short-term fixes like the HTML snapshot (Google + Bing), and long-term redesigns with modern frameworks that have built-in server-side rendering.
  66. 66. Dankeschön! Questions? Ari Nahmani CEO / Founder Kahena Digital Marketing ari@kahenadigital.com @AriNahmani
  67. 67. References: • Can You Now Trust Google To Crawl Ajax Sites? • Search Engine Optimization Best Practices for AJAX URLs | Webmaster Blog • We Tested How Googlebot Crawls Javascript And Here's What We Learned • Prerender - AngularJS SEO, BackboneJS SEO, or EmberJS SEO • SMX Munich Advanced Technical SEO Brainstorm - Google Docs • www.simoahava.com/seo/dynamically-added-meta-data-indexed-google-crawlers/ • Speakers | Search Marketing Expo - SMX Munich • JavaScript + SEO: Better Together - Medium • SEO AJAX Crawlability in a Responsive Publisher World • SEO Strategies for JavaScript-Heavy Single Page Applications or AJAX Sites | Search Engine Watch • The Basics of JavaScript Framework SEO in AngularJS - Builtvisible • Can Search Engines Crawl Javascript? • https://www.w3.org/wiki/Graceful_degradation_versus_progressive_enhancement#Graceful_degradation_and_progressive_enhancement_in_a_nutshell • SEO and JS: New Challenges • BromBone | SEO for your AngularJS, EmberJS, or BackboneJS website. • DIY AngularJS SEO with PhantomJS (the easy way!) | Lawsonry • https://scotch.io/tutorials/angularjs-seo-with-prerender-io
  68. 68. Image Credits: fat-american-1.jpg (1280×955) bigbrands1.jpg (570×383) consistencydemotivator_large.jpeg (480×338) 04-godfather-keep-friend.jpg (518×300) 4da1a1a23dba011a7ba6918986a6b818302b949ae694b27d559cf8e73308bf7b.jpg (604×392) the-17-craziest-cannibal-attacks-in-history-u2.jpg (520×272) taxonomy-types-800x450.png (800×450) wireframes-homecat.png (1000×460) Check-yoself.jpg (800×1025) Dangerous-Curve-Ahead-Sign-K-6513.gif (400×400) crawlerserver2.png (884×445) beach.png (1196×838)
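
The canonical-tag approach on slides 17-19 and the bloat definition on slide 12 can be sketched in a few lines of Python. This is a minimal illustration, not the speaker's implementation: the function names, the `INDEXABLE_PARAMS` whitelist, and the page counts are all hypothetical, and a real site would still need its own rules for pagination and view-all pages.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical whitelist: facet parameters the site WANTS indexed (assumption).
INDEXABLE_PARAMS = frozenset({"color", "size"})


def canonical_url(url, keep=INDEXABLE_PARAMS):
    """Build a canonical URL by dropping every query parameter not in `keep`.

    Kept parameters are sorted so that the same refinement always produces
    the same URL, regardless of the order in which facets were clicked.
    """
    parts = urlsplit(url)
    kept = sorted((k, v) for k, v in parse_qsl(parts.query) if k in keep)
    # The fragment is dropped: it is not part of the canonical URL here.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def is_bloated(indexed_urls, categories, product_pages, static_pages):
    """Slide 12 rule of thumb: indexed URLs > sum(CAT + PDP + Static)."""
    return indexed_urls > categories + product_pages + static_pages


if __name__ == "__main__":
    facet_url = (
        "http://www.site.com/guys/tees/"
        "?prefn1=bvAverageRating&srule=sortingNewArrival&size=large&color=green"
    )
    # Basic solution: strip all parameters.
    print(canonical_url(facet_url, keep=frozenset()))
    # Variant: expose only whitelisted facets.
    print(canonical_url(facet_url))
    # Illustrative counts only, not data from the talk.
    print(is_bloated(indexed_urls=12000, categories=200, product_pages=9000, static_pages=50))
```

Passing an empty `keep` set reproduces the "strip out the unnecessary parameters" solution; the whitelist variant covers the case where specific facets such as size or color should stay indexable.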