Extract version numbers from strings

medium

By - Aman Pareek

Last Updated - 15/09/2024

Problem Statement

In software development and project management, version numbers are crucial for tracking releases, updates, and changes. These version numbers often appear in various documents, release notes, or even in code comments. The format of these version numbers typically follows a pattern such as v1.2.3, where:

  • v is a common prefix indicating a version.

  • The version number consists of several numeric segments separated by dots.

Objective:

Develop a JavaScript function that extracts all version numbers from a given block of text. The function should identify version numbers that follow a specific pattern and return them as an array, each exactly as it appears in the text. The function needs to handle:

  • Version numbers prefixed with v (e.g., v1.2.3, v10.11.12).

  • Versions with multiple numeric segments (e.g., v1.2, v2.3.4).

  • Versions with any number of segments, as long as they keep the same structure: v, digits, then one or more dot-separated groups of digits (e.g., v3.4.5.6).

Requirements:

  1. Pattern Matching: The function should correctly identify version numbers prefixed with a lowercase v, followed by numeric segments separated by dots.

  2. Output: The function should return an array of version numbers extracted from the text. Each version number should be returned in the same format as it was found in the text.

  3. Handling Edge Cases: The function should handle common variations, such as a version followed directly by a comma or period, and ensure that leading or trailing whitespace in the text does not end up in the result.

Example 1

Input: textString = "The software is available in versions v1.0.0, v2.3, and v10.5.2."

Output: ["v1.0.0", "v2.3", "v10.5.2"]

Example 2

Input: textString = "Please check the release notes for v3.4.5.6 and v12.1.0. "

Output: ["v3.4.5.6", "v12.1.0"]

Solution 1: Extract version numbers from strings using regex

function extractVersionNumbers(text) {
    // Define the regex pattern to match version numbers (e.g., v1.2.3)
    const versionRegex = /\bv\d+(\.\d+)+\b/g;

    // Find all matches in the text
    const matches = text.match(versionRegex);

    // Return matches, or an empty array if no matches are found
    return matches ? matches.map(version => version.trim()) : [];
} 

const textString1 = "The software is available in versions v1.0.0, v2.3, and v10.5.2.";
extractVersionNumbers(textString1);  //output: ["v1.0.0", "v2.3", "v10.5.2"] 

const textString2 = "Please check the release notes for v3.4.5.6 and v12.1.0. ";
extractVersionNumbers(textString2);  //output: ["v3.4.5.6", "v12.1.0"]

  • Regex Pattern:

    • \bv - Matches a lowercase 'v' at a word boundary (i.e., the beginning of the version number).

    • \d+ - Matches one or more digits (the major version number).

    • (\.\d+)+ - Matches a dot followed by one or more digits, repeated one or more times (minor, patch, and any further segments). Because at least one such group is required, a bare v1 is not matched.

    • \b - Requires a word boundary after the last digit, so a match cannot end directly before a letter, digit, or underscore.

  • text.match(versionRegex) - Finds all occurrences of the version numbers that match the regex pattern in the provided text.

  • matches.map(version => version.trim()) - Calls trim() on each match as a safeguard. Since the pattern matches only v, digits, and dots, a match never contains whitespace, so this step leaves the matches unchanged.
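
To make the boundary rules above concrete, here is a small sketch that runs the same pattern against a few edge cases. The helper name findVersions is hypothetical and used only for this demonstration; the regex itself is the one from the solution.

```javascript
// Same pattern as in the solution above.
const versionRegex = /\bv\d+(\.\d+)+\b/g;

// Hypothetical helper for this demo: all matches, or [] when there are none.
const findVersions = (text) => text.match(versionRegex) ?? [];

// 'v' inside a word: no word boundary before it, so nothing matches.
console.log(findVersions("rev1.2 is not a version"));   // []

// Uppercase 'V' is not covered by the pattern.
console.log(findVersions("V1.2 uses an uppercase V"));  // []

// A single segment lacks the required ".digits" group.
console.log(findVersions("v1 has a single segment"));   // []

// A suffix after a hyphen is cut off; letters glued to the last segment
// make the regex backtrack and drop that segment.
console.log(findVersions("v1.2.3-beta and v1.2.3abc")); // ["v1.2.3", "v1.2"]

// Surrounding whitespace never becomes part of a match.
console.log(findVersions("padded   v4.5   text"));      // ["v4.5"]
```

The fourth case is worth noting: the trailing \b does not reject v1.2.3abc outright, it only shortens the match to v1.2. If such input should be skipped entirely, the pattern would need an extra check.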

Frequently asked questions

  1. What format of version numbers does the function handle?

    The function handles version numbers starting with a lowercase 'v', followed by numeric segments separated by dots (e.g., v1.2.3). It requires at least two segments and supports any number beyond that (e.g., v1.2, v3.4.5.6).

  2. Can the function handle version numbers with different prefixes?

    No, the function is specifically designed to extract version numbers starting with 'v'. If other prefixes are used, the regex pattern would need to be adjusted.

  3. What if version numbers have extra characters or formats?

    The regex pattern assumes a specific format. If versions include extra characters or different separators, the pattern may need modification to accommodate those variations.

  4. How can I modify the function to handle different version formats?

    To handle different formats, adjust the regex pattern to match the desired versioning scheme. For example, you could add support for versions with different prefixes or delimiters.
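
As one possible way to adjust the pattern for other prefixes or delimiters, the sketch below builds the regex from options. The function name extractVersions and the options prefixes and separators are assumptions made for illustration; they are not part of the original solution.

```javascript
// Escape characters that have special meaning in a regular expression.
function escapeRegex(literal) {
    return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical variant of the solution: `prefixes` and `separators`
// are illustrative options, defaulting to the original behavior.
function extractVersions(text, { prefixes = ["v"], separators = ["."] } = {}) {
    const prefix = prefixes.map(escapeRegex).join("|");
    const sep = separators.map(escapeRegex).join("|");

    // prefix, digits, then one or more "separator + digits" groups.
    // Note: separators may be mixed within one version (e.g., v1.2-3).
    const pattern = new RegExp(`\\b(?:${prefix})\\d+(?:(?:${sep})\\d+)+\\b`, "g");

    return text.match(pattern) ?? [];
}

console.log(extractVersions("Upgrade from v1.2 to V2.0.1", { prefixes: ["v", "V"] }));
// ["v1.2", "V2.0.1"]

console.log(extractVersions("Builds ver3-4-5 and v6.7", {
    prefixes: ["ver", "v"],
    separators: [".", "-"],
}));
// ["ver3-4-5", "v6.7"]
```

Called without options, this sketch behaves like the original function. Escaping the prefixes and separators keeps characters such as the dot from being interpreted as regex syntax.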