Template Generation System
The Cirron CLI uses a template generation system that creates framework-specific code based on your chosen template and model type. The system consists of two core components: data loader generators and model generators.
Overview
The Cirron CLI template system supports multiple ML framework templates with both inference and training variations:
- PyTorch: pytorch (inference) and pytorch-train (training pipeline)
- TensorFlow: tensorflow (inference) and tensorflow-train (training pipeline)
- Scikit-learn: sklearn (basic model) and sklearn-pipeline (full pipeline)
- Custom: custom (blank Python project template)
When you run cirron init my-project --template pytorch, the CLI:
- Selects the appropriate template from the available options
- Uses modular template generation functions to create project files
- Generates framework-specific code using dedicated generators
- Assembles the complete project with all necessary files
Template Generation Functions
Template files are generated by modular functions in src/commands/files/:
Core Generation Functions
- createCommonMLFiles() - Shared files (Dockerfile, requirements.txt, README, cirron.yaml)
- Framework-specific generators for each ML framework
- Training-specific variants for advanced pipelines
- Data and model utilities for complete project setup
Framework-Specific Generators
- createPyTorchFiles() - PyTorch inference templates
- createPyTorchTrainingFiles() - PyTorch training pipelines
- createTensorFlowFiles() - TensorFlow inference templates
- createTensorFlowTrainingFiles() - TensorFlow training pipelines
- createSklearnFiles() - Basic scikit-learn models
- createSklearnPipelineFiles() - Full scikit-learn pipelines
Utility Functions
- createDataFiles() - Data loading and preprocessing utilities
- createModelFiles() - Model architecture definitions
Data Loader Generation
The getDataLoaderCode() function generates framework-specific data loading utilities:
PyTorch Data Loader
```python
import torch
from torch.utils.data import Dataset, DataLoader
import pandas as pd
import numpy as np
from PIL import Image
import os


class CustomDataset(Dataset):
    def __init__(self, data_path, transform=None):
        self.data_path = data_path
        self.transform = transform

        # Load data based on model type
        if os.path.exists(data_path) and data_path.endswith('.csv'):
            self.data = pd.read_csv(data_path)
        else:
            # Handle image directories or other data formats
            self.data = self._load_data()

    def _load_data(self):
        """Load data from directory or other sources"""
        # Implement based on your data structure
        return []

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        # Implement data loading logic
        if hasattr(self.data, 'iloc'):  # DataFrame
            row = self.data.iloc[idx]
            # Adjust based on your data structure
            features = row[:-1].values.astype(np.float32)
            target = row[-1]
            if self.transform:
                features = self.transform(features)
            return torch.tensor(features), torch.tensor(target)
        else:
            # Handle other data types
            return torch.randn(10), torch.tensor(0)  # Placeholder


def get_data_loaders(config):
    """Create train and validation data loaders"""
    batch_size = config.get('batch_size', 32)

    # Create datasets, falling back to sample data if files are missing
    train_dataset = CustomDataset(
        data_path=os.path.join(config['data_path'], 'train.csv')
        if os.path.exists(os.path.join(config['data_path'], 'train.csv'))
        else 'data/sample/sample_data.csv'
    )
    val_dataset = CustomDataset(
        data_path=os.path.join(config['data_path'], 'val.csv')
        if os.path.exists(os.path.join(config['data_path'], 'val.csv'))
        else 'data/sample/sample_data.csv'
    )

    # Create data loaders
    train_loader = DataLoader(
        train_dataset,
        batch_size=batch_size,
        shuffle=True,
        num_workers=2
    )
    val_loader = DataLoader(
        val_dataset,
        batch_size=batch_size,
        shuffle=False,
        num_workers=2
    )

    return train_loader, val_loader
```
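A generated training script would typically consume these loaders as follows (a minimal sketch; the config keys match those read by get_data_loaders above, and the import path assumes the generated src/data_loader.py module shown in the project layout later in this page):

```python
from src.data_loader import get_data_loaders

config = {'data_path': 'data/', 'batch_size': 32}
train_loader, val_loader = get_data_loaders(config)

# Inspect one batch to confirm tensor shapes before training
features, targets = next(iter(train_loader))
print(features.shape, targets.shape)
```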
TensorFlow Data Loader
```python
import tensorflow as tf
import pandas as pd
import numpy as np
import os


def load_data_from_csv(filepath):
    """Load data from CSV file"""
    if not os.path.exists(filepath):
        # Use sample data if file doesn't exist
        filepath = 'data/sample/sample_data.csv'

    data = pd.read_csv(filepath)

    # Separate features and labels
    X = data.iloc[:, :-1].values.astype(np.float32)
    y = data.iloc[:, -1].values
    return X, y


def create_tf_dataset(X, y, batch_size=32, shuffle=True):
    """Create TensorFlow dataset"""
    dataset = tf.data.Dataset.from_tensor_slices((X, y))
    if shuffle:
        dataset = dataset.shuffle(buffer_size=1000)
    dataset = dataset.batch(batch_size)
    dataset = dataset.prefetch(tf.data.AUTOTUNE)
    return dataset


def get_data_loaders(config):
    """Create train and validation datasets"""
    batch_size = config.get('batch_size', 32)
    data_path = config.get('data_path', 'data/')

    # Load training data
    train_path = os.path.join(data_path, 'train.csv')
    if not os.path.exists(train_path):
        train_path = 'data/sample/sample_data.csv'
    X_train, y_train = load_data_from_csv(train_path)

    # Load validation data
    val_path = os.path.join(data_path, 'val.csv')
    if not os.path.exists(val_path):
        # Split training data for validation
        split_idx = int(0.8 * len(X_train))
        X_val, y_val = X_train[split_idx:], y_train[split_idx:]
        X_train, y_train = X_train[:split_idx], y_train[:split_idx]
    else:
        X_val, y_val = load_data_from_csv(val_path)

    # Create datasets
    train_dataset = create_tf_dataset(X_train, y_train, batch_size, shuffle=True)
    val_dataset = create_tf_dataset(X_val, y_val, batch_size, shuffle=False)

    return train_dataset, val_dataset
```
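The resulting tf.data.Dataset objects can be iterated directly or passed straight to model.fit(). A quick sanity check (illustrative sketch using the same config keys as above):

```python
config = {'data_path': 'data/', 'batch_size': 32}
train_ds, val_ds = get_data_loaders(config)

# Pull one batch to verify shapes; fit() would accept the datasets as-is
for batch_X, batch_y in train_ds.take(1):
    print(batch_X.shape, batch_y.shape)
```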
scikit-learn Data Loader
```python
import pandas as pd
import numpy as np
import os
from sklearn.model_selection import train_test_split


def load_data(data_path):
    """Load data for sklearn models"""
    if os.path.isfile(data_path):
        # Single file
        data = pd.read_csv(data_path)
    elif os.path.isdir(data_path):
        # Directory with train/val files
        train_path = os.path.join(data_path, 'train.csv')
        if os.path.exists(train_path):
            data = pd.read_csv(train_path)
        else:
            # Use sample data
            data = pd.read_csv('data/sample/sample_data.csv')
    else:
        # Use sample data as fallback
        data = pd.read_csv('data/sample/sample_data.csv')

    # Separate features and target
    X = data.iloc[:, :-1]
    y = data.iloc[:, -1]
    return X, y


def prepare_data(data_path, test_size=0.2, random_state=42):
    """Prepare data with train/test split"""
    X, y = load_data(data_path)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    return X_train, X_test, y_train, y_test
```
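A generated scikit-learn project would typically call prepare_data() once and hand the splits to a model (a minimal sketch; the data path is illustrative and the import assumes the generated src/data_loader.py module):

```python
from src.data_loader import prepare_data

X_train, X_test, y_train, y_test = prepare_data('data/train.csv', test_size=0.2)
print(len(X_train), 'training rows,', len(X_test), 'test rows')
```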
Model Generation
The getModelCode() function creates framework-specific model architectures:
PyTorch Models
Classification Model
```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ClassificationModel(nn.Module):
    def __init__(self, input_dim=10, hidden_dim=64, num_classes=2):
        super(ClassificationModel, self).__init__()
        self.input_dim = input_dim
        self.num_classes = num_classes

        self.layers = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(hidden_dim, hidden_dim // 2),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(hidden_dim // 2, num_classes)
        )

    def forward(self, x):
        return self.layers(x)


def create_model():
    return ClassificationModel()
```
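The model returns raw logits, so it pairs with nn.CrossEntropyLoss. A single training step might look like this (an illustrative sketch with synthetic tensors standing in for a real batch):

```python
import torch
import torch.nn as nn

model = create_model()  # ClassificationModel with input_dim=10, num_classes=2
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

features = torch.randn(8, 10)        # synthetic batch of 8 samples
targets = torch.randint(0, 2, (8,))  # synthetic integer class labels

logits = model(features)
loss = criterion(logits, targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```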
Regression Model
```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RegressionModel(nn.Module):
    def __init__(self, input_dim=10, hidden_dim=64):
        super(RegressionModel, self).__init__()
        self.input_dim = input_dim

        self.layers = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(hidden_dim, hidden_dim // 2),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(hidden_dim // 2, 1)
        )

    def forward(self, x):
        return self.layers(x)


def create_model():
    return RegressionModel()
```
Computer Vision Model
```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CNNModel(nn.Module):
    def __init__(self, num_classes=10):
        super(CNNModel, self).__init__()
        self.num_classes = num_classes

        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((7, 7))
        )
        self.classifier = nn.Sequential(
            nn.Dropout(0.5),
            nn.Linear(128 * 7 * 7, 512),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(512, num_classes)
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        return x


def create_model():
    return CNNModel()
```
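The CNN expects 3-channel image tensors; thanks to the adaptive pooling layer, any reasonable spatial size works. A quick shape check (illustrative):

```python
import torch

model = create_model()
x = torch.randn(1, 3, 224, 224)  # batch of one synthetic RGB image
logits = model(x)
print(logits.shape)  # torch.Size([1, 10])
```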
TensorFlow Models
Classification Model
```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


def create_model(input_dim=10, num_classes=2):
    model = keras.Sequential([
        layers.Dense(64, activation='relu', input_shape=(input_dim,)),
        layers.Dropout(0.2),
        layers.Dense(32, activation='relu'),
        layers.Dropout(0.2),
        layers.Dense(num_classes, activation='softmax')
    ])
    return model
```
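Because the final layer applies softmax over integer-labeled classes, the model pairs with sparse categorical cross-entropy. A minimal compile-and-fit sketch (synthetic data for illustration):

```python
import numpy as np

model = create_model(input_dim=10, num_classes=2)
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

X = np.random.rand(100, 10).astype(np.float32)  # synthetic features
y = np.random.randint(0, 2, size=(100,))        # synthetic integer labels
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
```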
Regression Model
```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


def create_model(input_dim=10):
    model = keras.Sequential([
        layers.Dense(64, activation='relu', input_shape=(input_dim,)),
        layers.Dropout(0.2),
        layers.Dense(32, activation='relu'),
        layers.Dropout(0.2),
        layers.Dense(1)
    ])
    return model
```
Computer Vision Model
```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


def create_model(num_classes=10, input_shape=(224, 224, 3)):
    model = keras.Sequential([
        layers.Conv2D(32, (3, 3), activation='relu', input_shape=input_shape),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(128, (3, 3), activation='relu'),
        layers.GlobalAveragePooling2D(),
        layers.Dropout(0.5),
        layers.Dense(512, activation='relu'),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation='softmax')
    ])
    return model
```
scikit-learn Models
Classification Model
```python
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.linear_model import LogisticRegression, LinearRegression
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.pipeline import Pipeline
import numpy as np


def create_model():
    """Create a classification model"""
    return RandomForestClassifier(
        n_estimators=100,
        max_depth=10,
        random_state=42
    )


def preprocess_data(X, preprocessor=None, fit=True):
    """Preprocess the input data"""
    if preprocessor is None and fit:
        preprocessor = StandardScaler()
        X_processed = preprocessor.fit_transform(X)
    elif preprocessor is not None:
        X_processed = preprocessor.transform(X)
    else:
        X_processed = X
    return X_processed, preprocessor
```
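Combined with the data utilities above, a typical end-to-end flow would be (a sketch; prepare_data comes from the generated data loader, and the data path is illustrative):

```python
X_train, X_test, y_train, y_test = prepare_data('data/train.csv')

# Fit the scaler on training data only, then reuse it on the test split
X_train_proc, preprocessor = preprocess_data(X_train, fit=True)
X_test_proc, _ = preprocess_data(X_test, preprocessor=preprocessor, fit=False)

model = create_model()
model.fit(X_train_proc, y_train)
print(model.score(X_test_proc, y_test))
```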
Regression Model
```python
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.linear_model import LogisticRegression, LinearRegression
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.pipeline import Pipeline
import numpy as np


def create_model():
    """Create a regression model"""
    return RandomForestRegressor(
        n_estimators=100,
        max_depth=10,
        random_state=42
    )


def preprocess_data(X, preprocessor=None, fit=True):
    """Preprocess the input data"""
    if preprocessor is None and fit:
        preprocessor = StandardScaler()
        X_processed = preprocessor.fit_transform(X)
    elif preprocessor is not None:
        X_processed = preprocessor.transform(X)
    else:
        X_processed = X
    return X_processed, preprocessor
```
Model Types Supported
The template generation system supports various model types:
Classification
- Use case: Binary and multi-class classification
- Output: Class probabilities
- Loss functions: Cross-entropy, sparse categorical cross-entropy (see the loss-mapping sketch after this section)
- Metrics: Accuracy, precision, recall, F1-score
Regression
- Use case: Continuous value prediction
- Output: Continuous values
- Loss functions: MSE, MAE
- Metrics: R² score, mean squared error
Computer Vision
- Use case: Image classification, object detection
- Architecture: Convolutional Neural Networks (CNNs)
- Input: Image tensors
- Output: Class predictions
Custom
- Use case: Specialized models, research prototypes
- Architecture: User-defined
- Flexibility: Maximum customization
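In PyTorch terms, the mapping from model type to loss function might look like the following. This is an illustrative sketch only; the key names are hypothetical and this is not the CLI's actual lookup:

```python
import torch.nn as nn

# Hypothetical mapping; key names are illustrative
LOSS_BY_MODEL_TYPE = {
    'classification': nn.CrossEntropyLoss(),   # raw logits + integer class labels
    'regression': nn.MSELoss(),                # continuous targets
    'computer-vision': nn.CrossEntropyLoss(),  # image classification
}
```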
Template Generation Process
1. Template Selection
```bash
cirron init my-project --template pytorch
```
The model type (classification, regression, etc.) is selected interactively.
2. Code Generation
The CLI calls the appropriate generation functions:
```typescript
// Generate data loader
const dataLoaderCode = getDataLoaderCode('pytorch', 'classification');

// Generate model code
const modelCode = getModelCode('pytorch', 'classification');

// Generate training files
await createPyTorchTrainingFiles(projectPath, projectName, options);
```
3. File Assembly
The system creates the complete project structure:
```
my-project/
├── src/
│   ├── model.py           # Generated model architecture
│   ├── data_loader.py     # Generated data loading utilities
│   ├── train.py           # Generated training script
│   └── inference.py       # Generated inference script
├── requirements.txt       # Framework-specific dependencies
├── Dockerfile             # Container configuration
└── cirron.yaml            # Project configuration
```
Customization Points
Adding New Model Types
To add a new model type (e.g., NLP, time series):
- Update model generation functions:
```typescript
export function getPyTorchModelCode(modelType: string): string {
  if (modelType === 'nlp') {
    return generateNLPModel();
  }
  // ... existing code
}
```
- Add data loader support:
```typescript
export default function getDataLoaderCode(framework: string, modelType: string): string {
  if (modelType === 'nlp') {
    return generateNLPDataLoader(framework);
  }
  // ... existing code
}
```
Framework Extensions
To add support for a new framework:
- Create framework-specific functions:
```typescript
export function getJAXModelCode(modelType: string): string {
  // Generate JAX model code
}

export function getJAXDataLoaderCode(modelType: string): string {
  // Generate JAX data loader code
}
```
- Update the main generation function:
```typescript
export default function getModelCode(framework: string, modelType: string): string {
  if (framework === 'jax') {
    return getJAXModelCode(modelType);
  }
  // ... existing code
}
```
Best Practices
Data Loading
- Handle missing data gracefully: Fall back to sample data when expected files are missing (see the sketch after this list)
- Support multiple formats: CSV, images, custom formats
- Optimize for framework: Use framework-specific optimizations
- Include preprocessing: Handle scaling, normalization, etc.
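The generated loaders above all apply the same fallback pattern; factored out, it amounts to a small helper like this (illustrative only, not part of the generated code):

```python
import os

def resolve_data_path(path, fallback='data/sample/sample_data.csv'):
    """Return the requested path if it exists, otherwise fall back to sample data."""
    return path if os.path.exists(path) else fallback
```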
Model Architecture
- Start simple: Begin with basic architectures
- Add regularization: Include dropout and batch normalization (see the sketch after this list)
- Consider model type: Optimize architecture for the task
- Provide flexibility: Allow parameter customization
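For example, batch normalization slots in alongside the dropout used in the generated PyTorch blocks (an illustrative variation, not the template's exact output):

```python
import torch.nn as nn

# Hypothetical variant of the generated classifier head with batch norm added
layers = nn.Sequential(
    nn.Linear(10, 64),
    nn.BatchNorm1d(64),  # normalize activations before the nonlinearity
    nn.ReLU(),
    nn.Dropout(0.2),
    nn.Linear(64, 2),
)
```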
Code Quality
- Consistent interfaces: Standardize method signatures
- Error handling: Include proper error messages
- Documentation: Add clear docstrings and comments
- Testing: Include example usage and validation
Next Steps