Problem Statement
Explain test parallelization strategies to reduce CI/CD pipeline execution time. Include parallel execution approaches, test splitting, and handling shared resources.
Explanation
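Test parallelization runs tests concurrently to reduce total wall-clock time. The main strategies are process-level parallelization (multiple worker processes on one machine), pipeline-level parallelization (independent CI jobs running side by side), and test sharding (one suite split across several machines or jobs). The strategies combine: each shard can itself run several worker processes.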
Jest runs test files in parallel across worker processes by default:
```javascript
// jest.config.js
module.exports = {
  maxWorkers: '50%', // Use up to 50% of the available CPU cores
  // or maxWorkers: 4 for a fixed number of workers
};

// Tests inside one file run in declaration order. To run the test files
// themselves one at a time (for example when they share a resource), use:
//   npx jest --runInBand
```
pytest parallel with pytest-xdist:
```bash
# Install
pip install pytest-xdist
# Run with 4 workers
pytest -n 4
# Auto-detect CPUs
pytest -n auto
# Keep tests from the same module or class on one worker (helps with shared fixtures)
pytest -n auto --dist loadscope
```
Test sharding splits tests across CI jobs:
```yaml
# GitLab CI
test:
  stage: test
  parallel: 4
  script:
    - npm test -- --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL
```
GitHub Actions matrix:
```yaml
strategy:
  matrix:
    shard: [1, 2, 3, 4]
steps:
  - run: npm test -- --shard=${{ matrix.shard }}/4
```
Playwright sharding:
```javascript
// playwright.config.js
export default {
  workers: process.env.CI ? 4 : undefined,
  fullyParallel: true,
};

// Run a specific shard
// npx playwright test --shard=1/4
```
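Every shard has to compute the same partition on its own, without talking to the other shards. The sketch below illustrates that property with a stable hash of the file path. It is not how Jest or Playwright assign files internally, and `shard_for` and `select_shard` are hypothetical helper names.

```python
import hashlib


def shard_for(test_path: str, total_shards: int) -> int:
    """Return the 1-based shard index that owns test_path."""
    digest = hashlib.sha256(test_path.encode("utf-8")).digest()
    # A stable hash gives the same answer on every machine and every run.
    return int.from_bytes(digest[:8], "big") % total_shards + 1


def select_shard(test_paths, shard_index: int, total_shards: int):
    """Keep only the paths assigned to shard_index (1-based)."""
    return [p for p in sorted(test_paths)
            if shard_for(p, total_shards) == shard_index]


if __name__ == "__main__":
    files = [f"tests/test_{i}.py" for i in range(10)]
    for index in range(1, 5):
        print(index, select_shard(files, index, 4))
```

Hash-based assignment needs no shared state, but it ignores test duration, so shards can end up uneven.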
Intelligent test splitting based on duration:
```yaml
# CircleCI with test splitting
# Timing data comes from results uploaded with store_test_results in earlier runs
test:
  parallelism: 4
  steps:
    - run:
        command: |
          circleci tests glob "tests/**/*.test.js" | \
            circleci tests split --split-by=timings | \
            xargs npm test --
```
Database isolation for parallel tests:
```javascript
// Create a separate database per worker.
// JEST_WORKER_ID is set by Jest; createDatabase, runMigrations, and
// dropDatabase stand for project-specific helpers.
const workerId = process.env.JEST_WORKER_ID || '1';
const databaseName = `test_db_${workerId}`;

beforeAll(async () => {
  await createDatabase(databaseName);
  await runMigrations(databaseName);
});

afterAll(async () => {
  await dropDatabase(databaseName);
});
```
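The same per-worker naming can be sketched for pytest-xdist, which exposes the worker name (for example `gw0`) in the `PYTEST_XDIST_WORKER` environment variable. The function name `worker_database_name` and the `main` fallback for non-parallel runs are assumptions made for this illustration.

```python
import os
import re


def worker_database_name(base: str = "test_db", environ=os.environ) -> str:
    """Build a database name that is unique to the current xdist worker."""
    # pytest-xdist sets PYTEST_XDIST_WORKER in each worker process;
    # the variable is absent when tests run without -n.
    worker = environ.get("PYTEST_XDIST_WORKER", "main")
    # Keep only characters that are safe in an unquoted SQL identifier.
    safe = re.sub(r"[^A-Za-z0-9_]", "_", worker)
    return f"{base}_{safe}"


if __name__ == "__main__":
    print(worker_database_name())
```

Passing `environ` explicitly keeps the function easy to test without touching the real environment.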
Testcontainers with port mapping:
```java
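@Container
private static PostgreSQLContainer<?> postgres =
    new PostgreSQLContainer<>("postgres:13")
        .withDatabaseName("testdb")
        // Testcontainers maps the exposed port to a random free host port,
        // so containers started by parallel workers do not collide
        .withExposedPorts(5432);
// Read the real address with postgres.getJdbcUrl() or postgres.getMappedPort(5432)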
```
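Outside Testcontainers, the same random-port idea can be approximated by asking the operating system for an ephemeral port. This is a minimal sketch; `find_free_port` is a hypothetical helper, and the port could in principle be taken by another process between this call and the moment the test server binds it.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port on host."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Binding to port 0 lets the kernel choose a free ephemeral port.
        sock.bind((host, 0))
        return sock.getsockname()[1]


if __name__ == "__main__":
    print(find_free_port())
```

Where possible, letting the server under test bind to port 0 itself and report the chosen port avoids that small race entirely.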
Shared resource management with locks:
```javascript
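const lockfile = require('proper-lockfile');

test('test requiring exclusive resource', async () => {
  // realpath: false allows a lock target that does not exist yet;
  // retries makes other workers wait instead of failing with ELOCKED.
  const release = await lockfile.lock('/tmp/test-resource.lock', {
    realpath: false,
    retries: { retries: 50, minTimeout: 100, maxTimeout: 1000 },
  });
  try {
    // Use shared resource exclusively
    await useSharedResource();
  } finally {
    await release();
  }
});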
```
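A comparable cross-process lock can be sketched with only the standard library by relying on atomic exclusive file creation. The name `exclusive_lock` and its `timeout` and `poll` parameters are assumptions for this illustration, and the sketch does not recover a stale lock left behind by a crashed process.

```python
import contextlib
import os
import tempfile
import time


@contextlib.contextmanager
def exclusive_lock(path: str, timeout: float = 30.0, poll: float = 0.05):
    """Hold an exclusive lock file at path for the duration of the block."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only
            # one process can create it and thereby own the lock.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire lock {path}")
            time.sleep(poll)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    lock_path = os.path.join(tempfile.mkdtemp(), "test-resource.lock")
    with exclusive_lock(lock_path):
        print("holding", lock_path)
```

Locks serialize the tests that need them, so they are best reserved for resources that cannot be duplicated per worker.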
Test categorization for parallel execution:
```javascript
// Fast tests (unit tests)
describe('Unit Tests', () => {
  test('fast test 1', () => {});
  test('fast test 2', () => {});
});

// Slow tests (integration tests)
describe('Integration Tests', () => {
  test('slow test 1', async () => {});
  test('slow test 2', async () => {});
});
```
Pipeline with parallel stages:
```yaml
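# GitLab CI. Each parallel copy must select its own shard, otherwise every
# copy runs the full suite. Assumes the underlying runners accept --shard
# (Jest 28+, Playwright).
stages:
  - test

unit-tests:
  stage: test
  script:
    - npm run test:unit -- --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL
  parallel: 2

integration-tests:
  stage: test
  script:
    - npm run test:integration -- --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL
  parallel: 4

e2e-tests:
  stage: test
  script:
    - npm run test:e2e -- --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL
  parallel: 8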
```
Dynamic test allocation:
```python
# Split tests based on previous run times
import json
import sys

with open('test-timings.json') as f:
    timings = json.load(f)  # {"test name": duration in seconds}

# Longest tests first
tests = sorted(timings.items(), key=lambda x: x[1], reverse=True)
shard = int(sys.argv[1])  # Current shard, 0-based (subtract 1 from a 1-based CI index)
total_shards = int(sys.argv[2])

# Round-robin over the duration-sorted list: a rough balance, not an exact one
shard_tests = tests[shard::total_shards]
print(' '.join(name for name, _ in shard_tests))
```
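The round-robin slice above can still leave shards uneven when a few tests dominate the total time. One common alternative is a greedy assignment that always gives the next-longest test to the currently lightest shard. The sketch below assumes the same name-to-duration mapping as the timings file; `balance_shards` is a hypothetical name.

```python
import heapq


def balance_shards(timings: dict, total_shards: int) -> list:
    """Greedily assign tests to shards, longest test first."""
    # Heap of (accumulated duration, shard index); the lightest shard is on top.
    heap = [(0.0, index) for index in range(total_shards)]
    shards = [[] for _ in range(total_shards)]
    # Sort by descending duration, then by name to keep the result deterministic.
    for name, duration in sorted(timings.items(), key=lambda kv: (-kv[1], kv[0])):
        load, index = heapq.heappop(heap)
        shards[index].append(name)
        heapq.heappush(heap, (load + duration, index))
    return shards


if __name__ == "__main__":
    timings = {"a": 10, "b": 1, "c": 1, "d": 1}
    print(balance_shards(timings, 2))  # [['a'], ['b', 'c', 'd']]
```

For these timings the round-robin slice would pair `a` with `c` (11 seconds against 2), while the greedy assignment leaves `a` alone (10 seconds against 3).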
Handling test dependencies:
```javascript
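// Jest runs the tests in a file sequentially, in declaration order, so a
// plain describe block is enough for order-dependent steps. (Playwright,
// which can run tests within a file in parallel, offers test.describe.serial.)
describe('Order-dependent tests', () => {
  let userId;

  test('create user', async () => {
    userId = await createUser();
  });

  test('update user', async () => {
    await updateUser(userId);
  });

  test('delete user', async () => {
    await deleteUser(userId);
  });
});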
```
Optimization strategies:
1. Profile the test suite to identify slow tests
2. Run fast tests first (fail fast)
3. Balance shards by test duration
4. Use test-level parallelization for independent tests
5. Use job-level parallelization for test categories
6. Optimize slow tests before parallelizing
7. Monitor parallel execution efficiency
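One simple way to monitor efficiency, sketched here under the assumption that per-shard durations are available from CI, is to compare the total work done with the capacity that was reserved (number of shards times the slowest shard). The function name `parallel_efficiency` is hypothetical.

```python
def parallel_efficiency(shard_durations) -> float:
    """Return total work divided by (shard count * slowest shard), in (0, 1]."""
    durations = list(shard_durations)
    if not durations or max(durations) <= 0:
        raise ValueError("need at least one shard with a positive duration")
    wall_clock = max(durations)  # the job finishes when the slowest shard does
    total_work = sum(durations)
    return total_work / (len(durations) * wall_clock)


if __name__ == "__main__":
    print(parallel_efficiency([10, 10, 10, 10]))  # 1.0
    print(parallel_efficiency([30, 10, 10, 10]))  # 0.5
```

A value well below 1.0 means some shards sit idle while one long shard finishes, which usually points to rebalancing by duration.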
Challenges and solutions:
- Race conditions: isolate test data, use unique identifiers
- Resource contention: use separate resources per worker
- Flaky tests: fix root cause, don't just parallelize
- Uneven shard duration: split by test duration not count
- Setup overhead: cache dependencies, optimize startup
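The "unique identifiers" remedy for race conditions can be as small as a helper that suffixes every test-created record with a random token. This is a minimal sketch; `unique_name` is a hypothetical helper, and the 12-character suffix length is an arbitrary choice.

```python
import uuid


def unique_name(prefix: str) -> str:
    """Return a name that is very unlikely to collide across parallel tests."""
    # uuid4 is random, so workers need no coordination to avoid clashes.
    return f"{prefix}_{uuid.uuid4().hex[:12]}"


if __name__ == "__main__":
    print(unique_name("user"))
```

Tests that create `unique_name("user")` instead of a fixed `"testuser"` can run side by side against the same database without overwriting each other's rows.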
Best practices: ensure test independence, isolate data, use random ports, implement proper cleanup, monitor execution times, balance shards by duration, and fail fast on critical tests. Applied together, these techniques can cut feedback time substantially, for large suites from hours to minutes, although the gain is bounded by the slowest shard and by per-job setup overhead.