Tensors
- Scalar (Rank-0)
- Vector (Rank-1)
- Matrix (Rank-2)
- Higher Rank
Properties of tensors
- Shape
- Data type
- Values
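A quick sketch (assuming TensorFlow 2.x with eager execution) that creates a tensor of each rank and inspects these properties:
import tensorflow as tf

scalar = tf.constant(7)                 # rank-0
vector = tf.constant([1, 2, 3])         # rank-1
matrix = tf.constant([[1, 2], [3, 4]])  # rank-2
cube   = tf.zeros((2, 3, 4))            # rank-3 (higher rank)

print(matrix.shape)    # (2, 2)
print(matrix.dtype)    # <dtype: 'int32'>
print(matrix.numpy())  # the values as a NumPy array
print(tf.rank(cube))   # tf.Tensor(3, shape=(), dtype=int32)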
Operations
tensor_a = tf.constant([[1, 3], [3, 4]])
tensor_b = tf.constant([[1, 3], [3, 4]])
tf.add(tensor_a, tensor_b)       # elementwise addition
tf.subtract(tensor_a, tensor_b)  # elementwise subtraction
tf.multiply(tensor_a, tensor_b)  # elementwise multiplication
# matrix multiplication
tf.matmul(tensor_a, tensor_b)
Constants and Variables
In TensorFlow, constants are values that don’t change during the execution of the program. They are useful for creating nodes in the computational graph with fixed values. Constants are created using tf.constant.
Variables, on the other hand, maintain state across sessions and are used for parameters in machine learning models. They can be updated during the training process. Variables are created using tf.Variable.
import tensorflow as tf
# Define a constant
a = tf.constant(5, dtype=tf.int32)
b = tf.constant(6, dtype=tf.int32)
# Perform an operation
c = tf.add(a, b)
# Initialize a session to run the graph
with tf.Session() as sess:
    result = sess.run(c)
    print(result)  # Output: 11
# Define a variable
weight = tf.Variable(0.5, dtype=tf.float32)
# Perform an operation
new_weight = weight * 2
# Initialize all variables
init = tf.global_variables_initializer()
# Initialize a session to run the graph
with tf.Session() as sess:
    sess.run(init)
    result = sess.run(new_weight)
    print(result)  # Output: 1.0
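For comparison, a rough sketch of the same constant/variable example in TensorFlow 2.x style, where no sessions or initializers are needed because of eager execution:
import tensorflow as tf

a = tf.constant(5, dtype=tf.int32)
b = tf.constant(6, dtype=tf.int32)
print(tf.add(a, b).numpy())  # 11

weight = tf.Variable(0.5, dtype=tf.float32)
print((weight * 2).numpy())  # 1.0

# Variables can also be updated in place
weight.assign(0.75)
weight.assign_add(0.25)
print(weight.numpy())        # 1.0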
Reductions
x = tf.constant([[1, 2, 3], [4, 5, 6]])
print(tf.reduce_sum(x)) # total sum = 21
print(tf.reduce_sum(x, axis=0)) # column-wise sum = [5, 7, 9]
print(tf.reduce_sum(x, axis=1)) # row-wise sum = [6, 15]
print(tf.reduce_mean(x)) # mean of all elements
print(tf.reduce_max(x)) # max element
print(tf.reduce_min(x)) # min element
Reshaping & Transpose
x = tf.constant([[1, 2, 3], [4, 5, 6]])
print(tf.reshape(x, (3, 2))) # reshape to 3x2
print(tf.transpose(x)) # transpose matrix
# Elementwise math
x = tf.constant([1, 2, 3, 4, 5])
print(tf.square(x))                         # elementwise square
print(tf.sqrt(tf.cast(x, tf.float32)))      # sqrt (needs float type)
print(tf.exp(tf.cast(x, tf.float32)))       # exponential (needs float type)
print(tf.math.log(tf.cast(x, tf.float32)))  # natural log (needs float type)
# Random tensors
print(tf.random.normal((2,2), mean=0, stddev=1)) # Gaussian
print(tf.random.uniform((2,2), minval=0, maxval=10)) # Uniform
Graphs and sessions
A computational graph is a series of TensorFlow operations arranged into a graph of nodes. Each node represents an operation, and edges represent the data (tensors) flowing between these operations.
TensorFlow internally represents our code as a graph:
- Nodes = operations
- Edges = tensors flowing between them
Two modes:
- Eager execution → (default) Python runs ops one by one. Easy to debug, but slower.
- Graph execution → TensorFlow converts your Python code into a graph → optimizes it → runs faster on GPU/TPU.
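A small sketch contrasting the two modes (tf.function is covered in more detail below):
import tensorflow as tf

# Eager execution: the op runs immediately and returns a concrete value
a = tf.constant(2.0)
b = tf.constant(3.0)
print(a * b)           # tf.Tensor(6.0, shape=(), dtype=float32)

# Graph execution: the same computation traced into a graph via tf.function
@tf.function
def multiply(x, y):
    return x * y

print(multiply(a, b))  # same result, executed from the traced graph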
Sessions
NOTE: Before TensorFlow 2.0, graphs had to be run explicitly inside a session as shown below; in current versions this is handled automatically (eager execution).
A session is used to execute the operations defined in the computational graph. It allocates resources (such as GPU memory) and manages the execution of operations.
import tensorflow as tf
# Define a simple computational graph
a = tf.constant(5)
b = tf.constant(3)
c = tf.add(a, b)
# Create and run a session to execute the graph
with tf.Session() as sess:
    result = sess.run(c)
    print(result)  # Output: 8
- Graph: The blueprint of operations and data flow.
- Session: The runtime environment for executing the graph.
Graphs and Sessions in TensorFlow 2.x
In TensorFlow 2.x, eager execution is enabled by default, making the framework more intuitive and user-friendly. Operations are executed immediately as they are called from Python. However, you can still build graphs using tf.function for more complex or performance-critical scenarios.
import tensorflow as tf
# Define a function that builds a graph
@tf.function
def compute(a, b):
    return tf.add(a, b)
# Call the function
result = compute(5, 3)
print(result)  # Output: tf.Tensor(8, shape=(), dtype=int32)
TensorFlow traces the Python function into a computation graph that is optimized, parallelized, and portable.
Transitioning from TensorFlow 1.x to 2.x
- In TensorFlow 1.x, you explicitly define the graph and then create a session to execute it.
- In TensorFlow 2.x, eager execution is the default mode, and you can use tf.function to create a graph if needed.
@tf.function
def my_func(x):
    print('Tracing.\n')
    return tf.reduce_sum(x)
tf.function is the bridge between the two worlds.
When you decorate a Python function with @tf.function:
- First call → tracing phase
  - TensorFlow watches all TensorFlow ops (tf.add, tf.reduce_sum, etc.) executed inside.
  - Instead of just running them, it records them into a computation graph (a tf.Graph).
  - Python-native code (like print, loops over normal Python lists) isn’t recorded, because that can’t be exported into a graph.
- Subsequent calls → execution phase
  - Instead of rerunning Python, TensorFlow just runs the optimized graph.
  - That’s why the second call doesn’t print “Tracing.”: the Python print was not part of the graph (see the snippet below).
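Calling the decorated my_func above makes the two phases visible. A quick sketch (the extra float-tensor call is only there to show that a new input signature triggers another trace):
x = tf.constant([1, 2, 3])

print(my_func(x))                        # prints "Tracing." once, then tf.Tensor(6, ...)
print(my_func(x))                        # no "Tracing.": the already-traced graph is reused
print(my_func(tf.constant([1.0, 2.0])))  # different dtype → a new trace, prints "Tracing." again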
With a graph, we have a great deal of flexibility. You can use your TensorFlow graph in environments that don’t have a Python interpreter, like mobile applications, embedded devices, and backend servers. TensorFlow uses graphs as the format for saved models when it exports them from Python.
Graphs are also easily optimized, allowing the compiler to do transformations like:
- Statically infer the value of tensors by folding constant nodes in your computation (“constant folding”).
- Separate sub-parts of a computation that are independent and split them between threads or devices.
- Simplify arithmetic operations by eliminating common subexpressions.
Gradients
Training = adjust weights to reduce error.
We need gradients = derivatives of loss w.r.t. variables.
TensorFlow uses automatic differentiation with tf.GradientTape:
- During forward pass, TensorFlow records operations onto a “tape”.
- During backward pass, it replays the tape to compute derivatives (chain rule).
x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x**2  # y = x^2
dy_dx = tape.gradient(y, x)
print(dy_dx.numpy()) # 6.0 (since derivative of x^2 is 2x = 6)
W = tf.Variable(2.0)
b = tf.Variable(1.0)
x = tf.constant(3.0)
with tf.GradientTape() as tape:
    y = W * x + b       # model
    loss = (y - 10)**2  # squared error
dW, db = tape.gradient(loss, [W, b])
print(dW.numpy(), db.numpy())
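To close the loop, a minimal sketch of how these gradients would be used for one update step (the 0.01 learning rate is an arbitrary choice for illustration):
learning_rate = 0.01
W.assign_sub(learning_rate * dW)  # W ← W - lr * dL/dW
b.assign_sub(learning_rate * db)  # b ← b - lr * dL/db
print(W.numpy(), b.numpy())       # both moved in the direction that reduces the loss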
Tensorboard
TensorBoard is a suite of visualization tools provided by TensorFlow that enables you to inspect and understand your machine learning workflows.
Visualizing the computational graph
# below is version 1
import tensorflow as tf
# Reset the default graph
tf.reset_default_graph()
# Define the computational graph
a = tf.constant(5, name='a')
b = tf.constant(3, name='b')
c = tf.add(a, b, name='c')
# Create a summary to visualize in TensorBoard
writer = tf.summary.FileWriter('./logs', tf.get_default_graph())
with tf.Session() as sess:
    result = sess.run(c)
    print(result)  # Output: 8
# Close the writer
writer.close()
# below is version 2
import tensorflow as tf
# Define the function
@tf.function
def my_func(a, b):
    return tf.add(a, b, name='c')
# Create a summary writer
logdir = './logs'
writer = tf.summary.create_file_writer(logdir)
# Trace the function and log the graph
tf.summary.trace_on(graph=True, profiler=True)
result = my_func(tf.constant(5), tf.constant(3))
with writer.as_default():
    tf.summary.trace_export(name="my_func_trace", step=0, profiler_outdir=logdir)
Then launch TensorBoard from the terminal:
tensorboard --logdir=./logs
Keras
Keras is the high-level API of the TensorFlow platform. It provides an approachable, highly-productive interface for solving machine learning (ML) problems, with a focus on modern deep learning. Keras covers every step of the machine learning workflow, from data processing to hyperparameter tuning to deployment. It was developed with a focus on enabling fast experimentation.
Layers and Models
- Layer = one graph node with weights + computation.
- Model = a graph of layers (usually DAG).
So Keras is essentially a graph-building DSL (domain-specific language), where you describe how data flows through transformations (layers).
Components of a Layer (see the custom-layer sketch after this list):
- Weights (parameters):
  - Created automatically when the layer first sees the input shape.
  - Can be trainable (kernel, bias) or frozen (like BatchNorm’s running averages).
- Computation (call method):
  - Defines the forward pass (how input → output).
  - Example: Dense layer = output = activation(input @ W + b).
- State:
  - Some layers (like BatchNorm, Dropout) also carry state that affects computation even during inference.
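As a rough illustration of these pieces, a minimal custom layer, a hypothetical MyDense (not a built-in Keras class), that creates its weights in build() and defines its computation in call():
import tensorflow as tf

class MyDense(tf.keras.layers.Layer):
    def __init__(self, units):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        # Weights are created lazily, once the input shape is known
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer="glorot_uniform", trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer="zeros", trainable=True)

    def call(self, inputs):
        # Forward pass: output = input @ W + b
        return tf.matmul(inputs, self.w) + self.b

layer = MyDense(4)
print(layer(tf.ones((2, 3))).shape)  # (2, 4); the weights were built on the first call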
A Model is “just” a special kind of Layer, but with:
- Multiple inputs/outputs possible.
- Built-in training, evaluation, prediction loops.
Types:
- Sequential model
  - Straight chain of layers.
  - Easy, but limited (no branching, no multiple inputs).
  - Example: Sequential([Dense(64), ReLU(), Dense(10)]).
- Functional API model
  - Treats layers like functions: y = Dense(64)(x).
  - We explicitly connect inputs → outputs.
  - Example: multi-input, multi-output models.
- Subclassing API
  - Inherit tf.keras.Model, define __init__ (layers) and call (forward pass).
  - Full control, allows non-standard architectures (like custom RNN loops).
  - More code, but maximum flexibility.
Models come with a built-in training workflow, so you don’t always need to write loops from scratch:
- model.compile
  - Choose optimizer, loss function, metrics.
- model.fit
  - Iterates over the data for N epochs.
  - Handles batching, shuffling, callbacks, distributed training.
- model.evaluate
  - Runs the model on test data, returns loss + metrics.
- model.predict
  - For inference only (no weight updates).
Sequential Model
The Sequential model (tf.keras.Sequential) is literally a list (stack) of layers applied one after the other.
It’s designed for architectures where data flows in one straight line:
Input → Layer 1 → Layer 2 → Layer 3 → Output
No branching, no merging, just a straight path.
- Weights and biases are created and managed automatically.
Method 1: Pass a list of layers at once
import tensorflow as tf
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation='relu', input_shape=(64,)),  # first layer needs input shape
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
Method 2: Add layers one by one
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(32, activation='relu', input_shape=(64,)))
model.add(tf.keras.layers.Dense(64, activation='relu'))
model.add(tf.keras.layers.Dense(10, activation='softmax'))
- model.layers → list of all layers.
- model.summary() → table with shapes & parameters.
- First layer requires input shape → so TensorFlow knows tensor dimensions.
- Once input is defined, the model automatically infers shapes for later layers.
Training Workflow
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10, batch_size=32)
model.evaluate(x_test, y_test)
predictions = model.predict(x_test)
Functional API
Build models as arbitrary directed acyclic graphs (DAGs) of layers.
- Multiple inputs or outputs.
- Non-linear topology (skip connections, branching, merging).
- More flexibility (e.g., ResNet, Inception).
from tensorflow.keras import layers, Model, Input
inputs = Input(shape=(784,))
x = layers.Dense(64, activation="relu")(inputs)
x1 = layers.Dense(32, activation="relu")(x)
x2 = layers.Dense(16, activation="relu")(x)
concat = layers.concatenate([x1, x2])
outputs = layers.Dense(10, activation="softmax")(concat)
model = Model(inputs=inputs, outputs=outputs)
Here the graph is not a straight line: the flow splits, transforms separately, then merges.
Note: “Dense” means fully connected: every output of one layer is connected to every input neuron of the next layer.
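As a sketch of the multi-input case mentioned above (the input sizes and layer widths here are arbitrary, illustrative choices):
from tensorflow.keras import layers, Model, Input

image_in = Input(shape=(784,), name="image")    # e.g. a flattened image
meta_in = Input(shape=(10,), name="metadata")   # e.g. extra tabular features

x = layers.Dense(64, activation="relu")(image_in)
m = layers.Dense(8, activation="relu")(meta_in)

merged = layers.concatenate([x, m])
outputs = layers.Dense(1, activation="sigmoid")(merged)

multi_input_model = Model(inputs=[image_in, meta_in], outputs=outputs)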
Model Subclassing (Custom Models)
Define a Python class that inherits from tf.keras.Model, implement __init__ (for layers) and call() (for the forward pass).
from tensorflow.keras import layers, Model
class MyModel(Model):
    def __init__(self):
        super().__init__()
        self.d1 = layers.Dense(64, activation="relu")
        self.d2 = layers.Dense(10, activation="softmax")

    def call(self, inputs):
        x = self.d1(inputs)
        return self.d2(x)
model = MyModel()
There is no automatic graph construction: we directly control the forward computation in call().
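That control pairs naturally with a custom training loop. A minimal sketch of a single training step for the model above (the optimizer, loss, and dummy data are arbitrary choices for illustration):
import tensorflow as tf

optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()

@tf.function
def train_step(x_batch, y_batch):
    with tf.GradientTape() as tape:
        predictions = model(x_batch, training=True)  # forward pass via call()
        loss = loss_fn(y_batch, predictions)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

# Dummy batch: 32 samples with 20 features, integer labels in [0, 10)
x = tf.random.normal((32, 20))
y = tf.random.uniform((32,), maxval=10, dtype=tf.int32)
print(train_step(x, y).numpy())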
Layers
| Layer Type | Example Layer | Mechanistic Purpose |
|---|---|---|
| Dense | Dense | Linear transform + activation |
| Recurrent | SimpleRNN, LSTM, GRU | Sequence modeling with hidden states |
| Convolution | Conv1D, Conv2D | Spatial feature extraction |
| Pooling | MaxPooling, AvgPooling | Downsample, invariance |
| Normalization | BatchNorm, LayerNorm | Stabilize training, normalize |
| Dropout | Dropout | Regularization |
| Embedding | Embedding | Token → dense vector mapping |
| Attention | MultiHeadAttention | Sequence weighting, context modeling |
| Utility | Flatten, Reshape, Concat | Reshape/combine tensors |
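As a rough illustration, a tiny text-classifier sketch that touches several of these layer types (the vocabulary size, sequence length, and layer widths are arbitrary):
import tensorflow as tf
from tensorflow.keras import layers

text_model = tf.keras.Sequential([
    layers.Embedding(input_dim=10000, output_dim=64),  # token id → dense vector
    layers.LSTM(32),                                   # sequence modeling
    layers.Dropout(0.5),                               # regularization
    layers.Dense(10, activation="softmax"),            # linear transform + activation
])

# Run a dummy batch of 4 sequences of 20 token ids through it
tokens = tf.random.uniform((4, 20), minval=0, maxval=10000, dtype=tf.int32)
print(text_model(tokens).shape)  # (4, 10)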