On December 15th, the Washington Post reported that the White House had told the CDC (Centers for Disease Control and Prevention) not to use 7 specific words in its communications.

CDC Banned Words Image

The 7 banned words were: “vulnerable,” “entitlement,” “diversity,” “transgender,” “fetus,” “evidence-based” and “science-based”.

Various media outlets, including the Huffington Post, argued that this was an Orwellian attempt to muzzle communications. The CDC later clarified its position.

Putting aside the controversial nature of this example, content marketers and communications teams struggle with this type of project.

It comes up, for example, when a product name changes, when your brand guidelines tell you to avoid certain terms, or when you must avoid complex terminology that will confuse your reader.

  • How do you audit for this in both offline and online communications?
  • How do you quantify the impact on existing websites or documents?
  • How do you prevent the wrong language seeping back into published communications?

So, while Trump’s edict is controversial and newsworthy, it uncovers a serious content strategy issue that marketers and communications teams grapple with daily.

How do you keep content clean?

When dealing with non-approved or non-compliant language, there are 3 core questions to ask:

  1. How many occurrences of the offending terms exist?
  2. How much effort will it take to fix?
  3. Which content should we prioritize for fixing?

In the CDC example, this would be nigh on impossible to do by hand. However, a new breed of language analysis tools makes it much easier to audit larger sites and collections of documents.

We used one such tool, VisibleThread Web, to search for the 7 words across the entire www.cdc.gov site.

Auditing the full CDC site for Trump’s 7 words

We pointed the tool at www.cdc.gov and kicked off an analysis.

Here’s the analysis underway:

Analysis of Webpages Image

We then created a “search dictionary” containing the 7 words.

The results are in: 2,417 individual references to the 7 terms!

Here’s what our analysis found:

  • Number of pages on the CDC site: 5,578
  • Number of pages containing the banned words: 1,082
  • Number of individual references to the banned words: 2,417
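To make those three numbers concrete, here is a minimal sketch of the kind of counting such an audit performs. It is not how VisibleThread works internally. It assumes the page text has already been crawled and stripped of HTML, and the `audit_pages` function, the `sample_pages` data and its URLs are hypothetical.

```python
import re
from collections import Counter

# The "search dictionary": the 7 terms from the report.
BANNED_TERMS = [
    "vulnerable", "entitlement", "diversity", "transgender",
    "fetus", "evidence-based", "science-based",
]

# One case-insensitive pattern; \b restricts matches to whole words.
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in BANNED_TERMS) + r")\b",
    re.IGNORECASE,
)


def audit_pages(pages):
    """Return (pages_with_hits, total_hits, hits_per_term) for a {url: text} mapping."""
    per_term = Counter()
    pages_with_hits = 0
    for url, text in pages.items():
        matches = [match.lower() for match in PATTERN.findall(text)]
        if matches:
            pages_with_hits += 1
        per_term.update(matches)
    return pages_with_hits, sum(per_term.values()), per_term


# Hypothetical sample input, not real CDC content.
sample_pages = {
    "/a": "Evidence-based care for vulnerable groups. Vulnerable patients need support.",
    "/b": "Nothing to flag here.",
    "/c": "Diversity and science-based policy.",
}

pages_with_hits, total_hits, per_term = audit_pages(sample_pages)
print(pages_with_hits, total_hits, dict(per_term))
```

Whole-word matching is a deliberate simplification: it will not catch variants such as plurals, so a real search dictionary may need to list those explicitly.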

Here is the breakdown by frequency of each term.

Banned Word List Image

Drilling down a level, let’s look at the page list.

For example, the page https://www.cdc.gov/hiv/group/gender/transgender/index.html has 72 occurrences in total: 71 of “transgender” and 1 of “evidence-based”.

Pages with Banned Words Image
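One simple way to turn a page list like this into a priority order is to sort pages by hit count. The sketch below assumes the per-page counts are available as a mapping. The `rank_pages` helper is hypothetical, and only the first entry comes from the audit; the two `example.org` entries are made up for illustration.

```python
def rank_pages(hits_by_page, top_n=10):
    """Return the top_n (url, hits) pairs, highest hit count first."""
    # Sort by descending count, then by URL so ties come out in a stable order.
    ranked = sorted(hits_by_page.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


hits_by_page = {
    "https://www.cdc.gov/hiv/group/gender/transgender/index.html": 72,  # from the audit
    "https://example.org/page-a": 5,   # hypothetical
    "https://example.org/page-b": 12,  # hypothetical
}

top = rank_pages(hits_by_page, top_n=2)
for url, hits in top:
    print(hits, url)
```

Raw hit count is only one possible signal. In practice you might weight it by page traffic or by how visible the page is to your audience.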

The cost of changing content

Now, let’s assume the CDC needs to go ahead and remove all 7 terms from all pages. Let’s estimate the cost.

When editing/removing content, there are two types of change patterns:

  • Simple copy edits: minor wording changes, for example using “Product Name XXX” rather than “Product Name YYY”.
  • Structural changes: where removing references necessitates a rethink of the web structure, often the URL name and underlying page structure. The transgender example above is a structural change, since the term appears in the URL itself.

Let’s assume that:

  • structural changes require 6 person-hours per change, and
  • simple copy edits require 30 minutes of effort each: essentially, the time it takes a good copywriter to edit the sentence and republish the page.

We’ll also assume that roughly 1/3 of the references (33%) require a structural change and the remaining 2/3 require a simple copy edit.

Here’s what we get:

  • Total number of occurrences we found on the CDC site: 2,417
  • Total number of structural changes (assume roughly 1/3, or 33%, of total occurrences): 798
  • Total number of simple changes (the remaining 2/3 or so of total occurrences): 1,619
  • Time taken to fix all structural changes: 798 x 6 hrs = 4,788 hours
  • Time taken to fix all simple changes: 1,619 x 30 mins = 809 hours (809.5, rounded down)
  • Total time to fix all changes: 4,788 + 809 = 5,597 hours


  • Assuming the average fully loaded cost of a communications/IT professional is $75/hour,
  • the cost of the CDC project comes to 5,597 x $75 = $419,775

So, were the CDC to fully follow through with the possible ban, it would cost the agency about $420k in structural and copy edits alone. This excludes any project management costs.
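The arithmetic above is easy to rerun with your own assumptions. The sketch below reproduces it. The `estimate_cost` function and its parameter names are hypothetical, and the 33% share and the rounding down of 809.5 hours to 809 are inferred from the figures in the estimate rather than stated rules.

```python
def estimate_cost(occurrences, structural_pct=33, structural_hours_each=6,
                  simple_minutes_each=30, hourly_rate=75):
    """Estimate the effort and cost of removing flagged terms.

    Every default is an assumption taken from the worked example, not a benchmark.
    """
    structural = round(occurrences * structural_pct / 100)
    simple = occurrences - structural
    structural_hours = structural * structural_hours_each
    # Whole hours only: 1,619 x 30 mins is 809.5 hours, reported above as 809.
    simple_hours = (simple * simple_minutes_each) // 60
    total_hours = structural_hours + simple_hours
    return {
        "structural_changes": structural,
        "simple_changes": simple,
        "structural_hours": structural_hours,
        "simple_hours": simple_hours,
        "total_hours": total_hours,
        "total_cost": total_hours * hourly_rate,
    }


estimate = estimate_cost(2417)
print(estimate)
```

Changing `structural_pct` or `hourly_rate` shows how sensitive the total is to each assumption. The structural share matters most, because each structural change costs 12 times as much effort as a copy edit.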


  • We’d recommend the White House reconsider banning the 7 words purely from a cost-benefit standpoint or, at the very least, prioritize the most important pages.
  • For more typical projects of this nature, use automation to size the issue.
  • Audit your site using a scanning solution like VisibleThread, then prioritize which pages need the most urgent fixes.