GitLab CI/CD Configuration for Running Semgrep

This document provides a breakdown of the .gitlab-ci.yml configuration used to run Semgrep for static code analysis in a GitLab CI/CD pipeline.

Pipeline Overview

The pipeline has a single stage, semgrep, which contains one job, run_semgrep. The job writes the scan results to a timestamped JSON file, which GitLab stores as a job artifact and retains for 1 week.

Configuration

    stages:
      - semgrep
    
    run_semgrep:
      stage: semgrep
      image: python:3.8
    
      before_script:
        - pip install --no-cache-dir --upgrade pip
        - pip install --no-cache-dir virtualenv
        - virtualenv venv
        - source venv/bin/activate
        - pip install --no-cache-dir semgrep
        - source venv/bin/activate && semgrep --version
    
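      script:
        # Timestamp the report using the Asia/Kolkata timezone
        - export TZ="Asia/Kolkata"
        - TIMESTAMP=$(date +%Y-%m-%d:%H.%M)
        - FILENAME="semgrep_report_${TIMESTAMP}.json"
        # JSON results go to the report file; --verbose logging stays in the job log
        - source venv/bin/activate && semgrep --config auto --json --verbose > "$FILENAME"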
    
      artifacts:
        paths:
          - semgrep_report_*.json
        expire_in: 1 week
    
      only:
        - branches

Explanation of Configuration

Stages

Defines a single stage named semgrep to run the Semgrep scan.

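Job (run_semgrep)

The run_semgrep job belongs to the semgrep stage and runs inside the python:3.8 container image. Its behavior is defined by the four keys described below.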

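before_script

Upgrades pip, installs virtualenv, creates a virtual environment named venv, activates it, and installs Semgrep into it. The final command prints the Semgrep version to confirm that the installation succeeded.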

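script

Sets the timezone to Asia/Kolkata, builds a report file name of the form semgrep_report_YYYY-MM-DD:HH.MM.json from the current time, and runs Semgrep with --config auto and --json, redirecting the JSON results into that file. The --verbose flag adds extra logging, which appears in the job log rather than in the report.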

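artifacts

Uploads every file matching semgrep_report_*.json as a job artifact. The expire_in: 1 week setting tells GitLab to delete the artifact one week after the job finishes.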
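
Because each run produces a differently named report, several reports can pile up in one directory after a few downloads. The sketch below is an illustration, not part of the pipeline; it picks the newest one by file name alone. It assumes the files keep the names the job gave them, and latest_report is a hypothetical helper name.

```python
from pathlib import Path


def latest_report(directory="."):
    """Return the newest semgrep_report_*.json file in directory, or None.

    Relies on the pipeline's timestamp format (YYYY-MM-DD:HH.MM), which is
    zero-padded and therefore sorts chronologically as plain text.
    """
    reports = sorted(
        Path(directory).glob("semgrep_report_*.json"),
        key=lambda path: path.name,
    )
    return reports[-1] if reports else None
```

Sorting by name avoids relying on file modification times, which usually reflect when a report was downloaded rather than when the scan ran.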

only

Restricts the job to pipelines that run for branches, so it does not run for tags.

Usage

  1. Ensure you have a GitLab Runner installed and configured with an executor that supports the image keyword, such as Docker.
  2. Add the above .gitlab-ci.yml file to the root of your repository.
  3. Push your changes to trigger the pipeline.
  4. Once the job finishes, download the Semgrep report from the job's artifacts and review the findings.
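
Reviewing the downloaded report can be partly automated. The following is a minimal Python sketch, separate from the pipeline above, that counts findings per severity. It assumes the report has a top-level "results" list whose entries store the severity under extra.severity; verify this layout against the output of your Semgrep version. summarize_report is a hypothetical helper name.

```python
import json
from collections import Counter


def summarize_report(report_text):
    """Count findings per severity in a Semgrep JSON report.

    Assumes a top-level "results" list whose entries carry the severity
    under extra.severity; findings without one are counted as UNKNOWN.
    """
    report = json.loads(report_text)
    severities = Counter()
    for finding in report.get("results", []):
        severity = finding.get("extra", {}).get("severity", "UNKNOWN")
        severities[severity] += 1
    return dict(severities)
```

Pass the function the text of a downloaded report, for example the contents of one semgrep_report_*.json file read with open(...).read(). A report with no findings yields an empty dictionary.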

For further customization, refer to the Semgrep documentation.