The SEO purist may ask why anyone would ever want to use PDF
content on a website for search purposes. The reality, however, is that
many businesses have a lot of PDF assets. These may include sell
sheets, brochures, white papers, technical briefs, etc. The purist
simply asks, “Why not convert these to HTML?” In the real world, not
everyone has the time, budget, and expertise to do that. There may also
be other “marketing” reasons. Perhaps a company wants its
prospects to experience the content along with all the other brand
elements inherent in its print materials. Whatever the reason, there
are lots of PDFs available on the web, and you can optimize PDFs to get
high-ranking search results. Here are some tips on the right way to do
it.
1. Make sure your PDFs are text based. Okay, this first one
is pretty obvious. However, we still find companies whose materials
were designed in an image-based program. When the PDF is made using
these programs, the PDF is an image; there is no text for the search
engines to read.
2. Complete the document properties. It seems like the vast
majority of PDFs are without specified document properties, the most
important of which is the Title. The Title property, if present, almost
invariably represents the words that will be displayed as the heading
of the search result. It’s the equivalent of the HTML title tag.
If you don’t complete the Title property, the search engine is
going to generate a title from the PDF’s content, and it may not
be what you would choose. We’ve all seen some pretty goofy-looking
titles on search results associated with PDFs. Not only do they
look ridiculous, but they probably won’t get clicked. In the full
version of Acrobat, go to File>Document Properties to specify the
Title.
There are other document properties (metadata) you can supply,
including Author, Subject, and Keywords, but presently these appear to
have little search-related effect. It would be nice if Subject acted as
the meta description to be displayed under the heading of the search
result, but I haven’t seen this to be true. For now, however,
I’d complete the Subject property as if it were a meta
description. Perhaps in the future search engines will treat it as such.
3. Optimize the copy. Copy in text-based PDFs is no different from web-page copy, so optimize it the same way you would optimize a web page.
4. Build links into PDFs. Make sure you include links in your
PDFs, and pay attention to the anchor text used. Search engines do
recognize these links. You won’t find backlinks in PDFs very often, but
they do turn up. Their limited occurrence, however, is likely related
to the fact that most people don’t put links into PDFs; most
people treat PDFs as static print documents. In addition to including
links in PDFs for search-related purposes, there’s also a good
business reason. Often, PDFs are passed along to others via email.
Accordingly, a reader may be viewing the PDF in isolation (i.e., not
associated with your website). By placing links into PDFs, you give
these readers an easy way to click back into your site, where you can
further influence them.
5. Pay attention to the version. While search engines do
“read” and index PDFs, their capabilities
tend to lag behind new versions of Acrobat. Although Acrobat 8 is out, for now
you should save your PDFs as version 1.6 (Acrobat 7) or lower to ensure
search engines can index the content.
Saving PDFs at a lower version is good not only for the search
engines but also for users. Not everyone has the latest
version of Acrobat Reader. Accordingly, I’d go a step further and recommend
saving PDFs as version 1.5 (Acrobat 6) or lower. This way they will work for
both search engines and most readers.
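If you have a lot of PDFs to check, the version can be read from the file itself. What follows is only a minimal sketch, not an Acrobat feature: it reads the `%PDF-x.y` header at the start of the file, and the function names and the 1.5 ceiling are my own choices for illustration. Be aware that the document catalog inside a PDF can declare a higher version than the header, so treat the result as a first check rather than the final word.

```python
import re

# Look in the first 1024 bytes, since some files carry a few stray bytes
# before the header.
_HEADER = re.compile(rb"%PDF-(\d)\.(\d)")


def pdf_header_version(data: bytes):
    """Return (major, minor) from the %PDF-x.y header, or None if absent."""
    match = _HEADER.search(data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_at_most(data: bytes, ceiling=(1, 5)) -> bool:
    """True when a header version is present and no higher than `ceiling`.

    The default ceiling of (1, 5) mirrors the recommendation above; it is an
    illustrative default, not a rule enforced by any search engine.
    """
    version = pdf_header_version(data)
    return version is not None and version <= ceiling
```

To use it, read the start of a file in binary mode, for example `is_at_most(open("brochure.pdf", "rb").read(1024))`, where `brochure.pdf` is a placeholder file name.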
6. Optimize the file size for search. Don’t post a huge
PDF for download. Not only is this annoying and unnecessary for site
visitors, it’s also burdensome for the search engines. If
it’s too big, the search engines may abandon the PDF before even
getting access to its content. Using the full version of Acrobat,
select Advanced>PDF Optimizer to “right-size” the
document.
You may also want to enable the “Optimize for Fast Web View” option
in the Preferences>General Settings panel. This allows the PDF to be
“loaded” a page at a time, so the reader does not have to wait for the
whole PDF to download.
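Both points can be spot-checked before you post a file. The sketch below is a rough heuristic under stated assumptions: the 2 MB budget is a number I made up for illustration (pick your own), and the Fast Web View test simply looks for the `/Linearized` marker that a linearized PDF declares near the start of the file. It does not validate that the file is correctly linearized.

```python
# Hypothetical size budget; no specific figure is recommended above.
MAX_BYTES = 2 * 1024 * 1024


def looks_linearized(data: bytes) -> bool:
    """Heuristic: a Fast Web View file declares /Linearized near its start."""
    return b"/Linearized" in data[:1024]


def audit_download(data: bytes, max_bytes: int = MAX_BYTES) -> dict:
    """Summarize size and Fast Web View status for a PDF's raw bytes."""
    return {
        "size_bytes": len(data),
        "within_budget": len(data) <= max_bytes,
        "fast_web_view": looks_linearized(data),
    }
```

For a real file, pass in the bytes read in binary mode, such as `audit_download(open("brochure.pdf", "rb").read())`, with `brochure.pdf` standing in for your own file.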
7. Pay attention to placement. If you bury links to PDFs deep
within your site’s file structure, they’re less likely to
get indexed. If you want to use PDFs for high-ranking search results,
links to those PDFs should be on web pages closer to the root level of
the site’s file structure.
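One quick way to see how deep your linking pages sit is to count the folders in their URLs. This is a simple sketch, assuming that URL folders mirror the site’s file structure, which is not true of every site. The function names and the example.com addresses are hypothetical.

```python
from urllib.parse import urlparse


def folder_depth(url: str) -> int:
    """Count the folders between the site root and the page."""
    segments = [part for part in urlparse(url).path.split("/") if part]
    # Treat a final segment containing a dot as a file name, not a folder.
    if segments and "." in segments[-1]:
        segments = segments[:-1]
    return len(segments)


def shallowest_first(urls):
    """Order candidate linking pages from closest to the root to deepest."""
    return sorted(urls, key=folder_depth)


if __name__ == "__main__":
    pages = [
        "https://www.example.com/resources/library/archive/page.html",
        "https://www.example.com/products/index.html",
        "https://www.example.com/",
    ]
    for page in shallowest_first(pages):
        print(folder_depth(page), page)
```

Pages at the top of the sorted list are the better candidates for carrying links to the PDFs you want indexed.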
8. Influence meta descriptions for PDFs. For web pages, the
meta description is what is displayed under the title in a search
result. With PDFs, the search engines search the copy of the PDF and
select something to display. While with PDFs you have less control over
what is displayed as the description in the search result, you can
still influence this. The best way to do this is to make sure that you
have a good, optimized sentence or two near the start of your PDF. If
these sentences correspond to the search term used, it’s likely
that these sentences are the ones that will be displayed as the
description under the search result’s heading.
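If you already have the text of a PDF extracted in reading order (the extraction itself is outside this sketch), you can check whether the opening sentences contain your target search term. This is a minimal illustration, not how any search engine actually picks its snippet. The function names are mine, the sentence splitting is deliberately naive, and the sample copy is invented.

```python
import re


def opening_sentences(text: str, count: int = 2) -> str:
    """Return the first `count` sentences, split naively on ., ! or ?"""
    collapsed = " ".join(text.split())
    sentences = re.split(r"(?<=[.!?])\s+", collapsed)
    return " ".join(sentences[:count])


def opening_covers_term(text: str, term: str, count: int = 2) -> bool:
    """True when every word of the search term appears in the opening."""
    opening = opening_sentences(text, count).lower()
    opening_words = set(re.findall(r"[a-z0-9]+", opening))
    term_words = re.findall(r"[a-z0-9]+", term.lower())
    return bool(term_words) and all(word in opening_words for word in term_words)


if __name__ == "__main__":
    # Invented sample copy, for illustration only.
    copy = (
        "Transit seating built for heavy daily use. "
        "Our seats meet federal safety standards. "
        "Contact us today."
    )
    print(opening_sentences(copy))
    print(opening_covers_term(copy, "transit seating"))
```

If the check fails for a term you care about, that is a hint to rework the first sentence or two of the PDF.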
9. Specify the reading order. As noted above, search engines
search the copy of the PDF and select something to display as a
description under the search result’s heading. Depending on how
the reading order of your PDF is specified, this may lead the search
engine to select some pretty strange stuff to display.
In a previous column, Organic Landing Page: A Case Study, I discussed a search result for “transit seating.” That search result is shown below:
Admittedly, this is not a very enticing description, and it’s
not likely to get clicked even if it ranks highly in the search
results. Why did Google select this text to display? Because it’s
the first thing Google read in the PDF.
Every PDF has a reading order. Similar to properly optimized web
pages, you want to make sure that valuable content is read first. How
do you know the reading order? With the PDF open and while using the
full version of Acrobat, select Advanced>Accessibility>Add Tags
to Document. Then select Advanced>Accessibility>Touch Up Reading
Order. Then the reading order of the PDF will be displayed.
You can see in the image above that the reading order of the transit
seating PDF does not start with valuable content. Rather, many
extraneous items are “read” before the valuable content.
That’s why Google displayed what it did in the search result. If
you want PDFs to be optimized for search, make sure you understand the
reading order of the PDF and use the Touch Up Reading Order tool to
manage what the search engine will read first.
10. Tag your PDFs. You can also add tags to your PDFs,
similar to HTML tags. Again, with the PDF open and while using the full
version of Acrobat, select Advanced>Accessibility>Add Tags to
Document. Acrobat will give you a document report and recommend things
you may want to consider changing. You’ll be able to tag
headings, add alternate text for images, etc.
11. Pay attention. Every time you open a PDF, make even a
small change, and save it once again, major unseen things may change.
The reading order may change automatically. You may inadvertently save
it as a higher version. It may get saved using the default size setting
instead of a properly optimized size. If you’re going to further
optimize existing PDFs, make sure you check all of these things before
posting a new version of the PDF.
Galen De Young is Managing Director of Francis SEO,
a firm specializing in B2B search engine optimization, and Francis
Marketing, one of the leading marketing consulting firms specializing
in repositioning B2B companies and their brands. You can reach Galen at
gdeyoung@francis-seo.com.