
Adding an environment

Steps to create a new environment, e.g. adding dev2 as a copy of dev.

1. AWS account setup

  • Create a new AWS account in the 'workloads' sub-organisation
  • In the AWS organisation parent account, add the new account to the developers role. It should then be listed in the 'AWS account switcher'
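
If you prefer the CLI to the console, the account can be created from the management account with the Organizations API. A hedged sketch: the email address and account name are placeholders, and the command is echoed for review rather than executed.

```shell
# Assumed naming: adjust env, email, and account name to match the org.
ENV=dev2
EMAIL="aws+gf-${ENV}@example.com"  # placeholder root email
# Echoed for review; drop the leading 'echo' to actually run it.
echo aws organizations create-account \
  --email "$EMAIL" \
  --account-name "goodfit-${ENV}"
```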

2. CDK / Infrastructure

  • Bootstrap CDK:
    cdk bootstrap aws://<accountid>/us-east-1
  • Create a PR to add the new environment to the ENVIRONMENTS object in bin/cdk.ts and a CI/CD pipeline to deploy it
  • Manually run CDK to create the DNS zone needed for SES (or run it in CD and let it fail, then retry once the DNS is set)
  • If you want to use a custom domain, set the nameservers at the domain registrar to those in the CDK stack output; skip this if not needed
  • Merge the PR and run the CD pipeline

Note: The deploy may fail on the crawler ECS service because the ECR image does not exist yet, so the container cannot start. You may need to redeploy that stack later once the image is pushed.
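
Once the crawler image has been pushed, the failed stack can be redeployed on its own. A sketch, assuming a hypothetical stack name and standard CDK flags, with the command echoed for review:

```shell
ENV=dev2
# Hypothetical stack name; check `npx cdk list` for the real one.
STACK="Crawler-${ENV}"
echo npx cdk deploy "$STACK" --require-approval never
```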

3. GitHub

  • Set up a new environment in GitHub (by convention the GitHub environment is 'development' while the stage is 'dev')
  • Add secrets as needed
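
Both steps can be done with the GitHub CLI. A sketch assuming the repo path and secret names (placeholders); `gh` has no dedicated environments subcommand, so the environment is created via the REST API. Commands are echoed for review.

```shell
GH_ENV=development          # GH environment name; the stage itself is 'dev'
REPO=goodfit/gf-sourcers    # placeholder repo
# Create (or update) the environment via the REST API:
echo gh api -X PUT "repos/${REPO}/environments/${GH_ENV}"
# Set environment-scoped secrets (values prompted for or piped in):
for s in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
  echo gh secret set "$s" --env "$GH_ENV" --repo "$REPO"
done
```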

4. Netlify

  • Set up a Netlify project for the new env with a name like goodfit-<env>
  • Copy env vars from an existing env and update them
  • You may need to add some later once things are deployed (Cognito ID, App Runner URL)
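
A sketch with the Netlify CLI; the site name follows the goodfit-<env> convention and the env var name is a placeholder. Commands are echoed for review.

```shell
ENV=dev2
SITE="goodfit-${ENV}"
echo netlify sites:create --name "$SITE"
echo netlify link --name "$SITE"
# Some values (Cognito ID, App Runner URL) only exist after deploy:
echo netlify env:set COGNITO_USER_POOL_ID "TODO-after-deploy"
```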

5. gf-sourcers repo

Create a PR with the environment config:

  • Add new env and account ID to the serverless base config and stage options
  • Update sops to add the new env, then re-create all the secrets (suggested workflow: keep the dev profile in one terminal and use sops to show the existing secrets; keep the new env's profile in a second terminal and set each sops secret there)
  • Add a new deploy_<ENV>.yml workflow
  • Update sync_config.yml workflow to add the new env
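
The sops side of this usually means a new creation rule. A hypothetical .sops.yaml fragment, where the path pattern and KMS key ARN are assumptions to adapt:

```yaml
# Hypothetical .sops.yaml fragment: route the new env's secrets file
# to the new account's KMS key (ARN is a placeholder).
creation_rules:
  - path_regex: secrets/dev2\.yaml$
    kms: arn:aws:kms:us-east-1:<accountid>:key/<key-id>
```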

6. Bootstrap and deploy

  • Deploy bootstrap manually:
    • Run BootstrapRDS
    • Run BootstrapRedshift
    • Run MigrateRDS
    • Run insert test data
    • Run check connections
  • Deploy systemTools manually:
    • Run TestVPC
  • Add new config to /config in gf-sourcers and run configSync in systemTools to populate AWS SSM Parameter Store
  • Merge PR and run the deploy pipeline (may need to run several times as dependency order works out)

At this point you can run TestVPC and TestDBSecretsConnections and everything should pass.
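
Assuming the bootstrap steps above are Serverless Framework functions (the function names come from the list; the invocation style is an assumption), the manual runs look like this. Commands are echoed for review.

```shell
ENV=dev2
# Echoed for review; drop 'echo' to actually invoke.
for fn in BootstrapRDS BootstrapRedshift MigrateRDS; do
  echo npx serverless invoke --function "$fn" --stage "$ENV"
done
```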

7. Coresignal data setup

Copy data from an existing environment:

mkdir data && cd data
awss gf-dev
aws s3 sync s3://goodfit-coresignal-data-dev/ .

awss gf-<env>
aws s3 sync . s3://goodfit-coresignal-data-<env>/

Run Coresignal step functions:

MultisourceComps -> {"mode":"MONTHLY","dateOrMonth":"202601"}
ProcessCompanies -> {"mode":"MONTHLY","dateOrMonth":"202509"}
                    then {"mode":"MONTHLY","dateOrMonth":"202511"}
ProcessJobsDaily -> {"date": "2025-10-01"}
MembersDaily     -> {"mode":"MONTHLY","dateOrMonth":"202510"}
                    then {"mode":"DAILY","dateOrMonth":"2025-09-10"}

Run gf-coresignal-dataset-<env>-IdResolverPushToQueue a few times; each invocation adds ~5k new LinkedIn companies to the LinkedIn table.
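
The step-function runs and the Lambda invocations above can be driven from the AWS CLI. A sketch: the state-machine ARN prefix and exact machine names are placeholders, the inputs are the ones listed above, and commands are echoed for review.

```shell
ENV=dev2
SFN_PREFIX="arn:aws:states:us-east-1:<accountid>:stateMachine"
# One example execution; repeat per machine/input from the list above.
echo aws stepfunctions start-execution \
  --state-machine-arn "${SFN_PREFIX}:ProcessJobsDaily-${ENV}" \
  --input "{\"date\": \"2025-10-01\"}"
# Top up the LinkedIn table (~5k companies per invocation):
for i in 1 2 3; do
  echo aws lambda invoke \
    --function-name "gf-coresignal-dataset-${ENV}-IdResolverPushToQueue" \
    /dev/null
done
```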

8. DBT setup

  • Create environment and connection in dbt Cloud
  • Set the full build to run nightly
  • Run it
  • Wait a few hours and re-run DBT

Done

You can now provision a test client + market and perform a dataset build. The dataset will grow over the next 24 hours or so as data is added and sourced from the Coresignal sample.