2 minute read | Concept

Organizational level content analysis

We can use Regex to identify stylistic errors or assess the impact of content changes.

Because we develop documentation as code, we can use a number of lightweight programs to search across multiple documents simultaneously. GitLab and GitHub can do this from within their environments; this page addresses searching locally on our own machines. This works because we write and store our content as plain text, not in a proprietary format.

This ability to conduct organization-wide document searches saves a lot of time and effort when we need to identify how a product or company update might impact existing documentation. For example:

* The company was recently bought out and we need to change legal names across all our documentation. With Regex, we can do this simultaneously for all documents within a few seconds.
* We updated some product specifications and need to identify which existing documents mention the affected information.
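As a rough illustration of searching local plain-text files for affected content, the following Python sketch walks a folder and reports every line that matches a pattern. It is not the tooling used on this site; the function name `find_mentions`, the default file extensions, and the case-insensitive matching are all assumptions made for the example.

```python
import re
from pathlib import Path


def find_mentions(root, pattern, suffixes=(".md", ".txt", ".rst")):
    """Return (path, line_number, line) for every line that matches pattern.

    Hypothetical helper: the suffix list is an assumption about which
    plain-text formats the documentation uses.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Skip folders and anything that is not a plain-text document.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling `find_mentions("docs", r"\bmaximum payload\b")`, where both the folder and the search term are placeholders, would list each file and line number that mentions the term, which gives a quick inventory of documents to review.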

We can also use Regex within a single document to quickly identify types of language we want to avoid, such as overuse of the passive voice in instructional documentation.

Example

We produced the outputs below by running the following regular expression on this website:

\b((is|was|are|were|has|have|had) (\w*ed|shown|taken|understood|chosen|come|found|gotten|known|made|thought|seen|been|gone)|will be|got|(had|made) (a|an|the)|should|shall)\b

It shows the types of language we want to avoid and where they occur. We can also run it on several documents simultaneously, change what we search for, or even find sentences or words over a certain number of characters.
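For readers who want to try this outside an editor, here is a minimal Python sketch that applies the expression above to a block of text and reports where each match occurs. The helper names `flag_language` and `long_sentences`, the 120-character default, and the naive sentence split on `.`, `!`, and `?` are assumptions for illustration only.

```python
import re

# The expression from the example above, split across lines for readability.
AVOID = re.compile(
    r"\b((is|was|are|were|has|have|had) "
    r"(\w*ed|shown|taken|understood|chosen|come|found|gotten|known|made|"
    r"thought|seen|been|gone)|will be|got|(had|made) (a|an|the)|should|shall)\b"
)


def flag_language(text):
    """Return (line_number, matched_phrase) for each match of AVOID."""
    flagged = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in AVOID.finditer(line):
            flagged.append((number, match.group(0)))
    return flagged


def long_sentences(text, max_chars=120):
    """Return sentences longer than max_chars.

    Sentences are split naively after '.', '!' or '?', so abbreviations
    such as 'e.g.' will produce false splits.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if len(s) > max_chars]
```

The pattern is case-sensitive as written, so a sentence that starts with "Was" or "Should" is not flagged unless the expression is compiled with `re.IGNORECASE`.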

We can also use a similar find and replace function to quickly identify text strings and update them. Because we use continuous integration pipelines to distribute documentation automatically, we can conduct wide-scale updates without having to open each document, change the text, re-generate it, and re-distribute it.
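A bulk find and replace of this kind could look roughly like the Python sketch below. It is an illustration under stated assumptions, not the editor feature referenced on this page: the function name `replace_in_docs`, the file extensions, and the `dry_run` safeguard are all hypothetical.

```python
import re
from pathlib import Path


def replace_in_docs(root, pattern, replacement,
                    suffixes=(".md", ".txt", ".rst"), dry_run=True):
    """Replace pattern in every matching file under root.

    Returns {path: number_of_replacements}. With dry_run=True (the
    default in this sketch) files are only inspected, never written.
    """
    regex = re.compile(pattern)
    changed = {}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8")
        new_text, count = regex.subn(replacement, text)
        if count:
            changed[str(path)] = count
            if not dry_run:
                path.write_text(new_text, encoding="utf-8")
    return changed
```

Running it first with `dry_run=True` shows which files would change and how many replacements each would receive. For example, `replace_in_docs("docs", r"\bAcme Ltd\b", "Acme Holdings plc")` uses made-up company names and a placeholder folder. Once the counts look right, a second call with `dry_run=False` writes the changes, and the pipeline can pick them up on the next commit.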

See also
Atom find and replace
