{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Suppose you have a factor graph with variables $X,Y,Z$ and a factor $f(X,Y,Z)$. You want to update the factor-to-variable message from $f$ to $X$ given by the formula,\n", "\\begin{equation}\n", " \\mu_{f\\rightarrow x}(x) = \\sum_z \\sum_y f(x,y,z) \\mu_{z \\rightarrow f}(z) \\mu_{y \\rightarrow f}(y)\n", "\\end{equation}\n", "In our example, assume that variables can take the following values:\n", "\\begin{equation}\n", " X \\in \\{1, \\ldots, 8\\} \\qquad Y \\in \\{1,2\\} \\qquad Z \\in \\{1,2,3\\}\n", "\\end{equation}\n", "We will start by defining our factor and incoming messages. " ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "mu_y2f shape: (2,)\n", "mu_z2f shape: (3,)\n", "Factor shape: (8, 2, 3)\n" ] } ], "source": [ "mu_y2f = np.random.rand(2)\n", "mu_z2f = np.random.rand(3)\n", "f = np.random.rand(8,2,3)\n", "print('mu_y2f shape: ', mu_y2f.shape)\n", "print('mu_z2f shape: ', mu_z2f.shape)\n", "print('Factor shape: ', f.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The values of the messages and factor ar arbitrary, so I just chose random values. Importantly, the size of the factor $f(x,y,z)$ is $8 \\times 2 \\times 3$. This is a multidimensionaly array whose axes align with each of the arguments. The messages are 1D arrays of the same length as the corresponding variables. \n", "\n", "Now we use Numpy.tensordot to compute the product of incoming messages,\n", "\\begin{equation}\n", " \\mu_{z \\rightarrow f}(z) \\mu_{y \\rightarrow f}(y)\n", "\\end{equation}" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "msg_prod shape: (2, 3)\n" ] } ], "source": [ "msg_prod = np.tensordot(mu_y2f, mu_z2f,axes=0)\n", "print('msg_prod shape: ', msg_prod.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Numpy.tensordot with argument axis=0 automatically aligns dimensions of its argument. The product is a 2D array of size $2 \\times 3$. The element tmp_msg[i,j] = mu_y2f[i] * mu_z2f[j]. This product aligns with the 2nd and 3rd dimensions of the factor. We will now replicate this across the first dimension (the X dimension)." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tmp_msg shape: (8, 2, 3)\n" ] } ], "source": [ "tmp_msg = np.tile(msg_prod, (8, 1, 1))\n", "print('tmp_msg shape: ', tmp_msg.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, tmp_msg has the same shape as the factor. The Numpy.tile function creates 8 copies of the 2D array, where each copy is aligned to the 1st axis. That means, each slice along the first axis is identical, which can check..." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Check 1: [[0.12741917 0.13021238 0.84547282]\n", " [0.02054304 0.02099337 0.1363106 ]]\n", "Check 2: [[0.12741917 0.13021238 0.84547282]\n", " [0.02054304 0.02099337 0.1363106 ]]\n", "The same!\n" ] } ], "source": [ "# Choose two slices arbitrarily\n", "check_1 = np.squeeze(tmp_msg[3,:,:])\n", "check_2 = np.squeeze(tmp_msg[7,:,:])\n", "print('Check 1: ', check_1)\n", "print('Check 2: ', check_2)\n", "print('The same!')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that everything is the same size, we can compute the final message. We first multiply the factor with the replicated message product element-wise:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tmp_msg shape: (8, 2, 3)\n" ] } ], "source": [ "tmp_msg = f * tmp_msg\n", "print('tmp_msg shape: ', tmp_msg.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we just sum over the 2nd and 3rd axes (corresponding to $y$ and $z$). The result is the factor-to-variable message $\\mu_{f \\rightarrow x}(x)$." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Message: [0.24521528 0.60436252 1.18748229 0.34408828 0.74763862 0.431988\n", " 0.56450834 0.46566119]\n", "Shape: (8,)\n" ] } ], "source": [ "mu_f2x = np.sum(tmp_msg, axis=(1,2))\n", "print('Message: ', mu_f2x)\n", "print('Shape: ', mu_f2x.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that the final message $\\mu_{f \\rightarrow x}(x)$ is an array of length 8. This is because it is a function of the variable $X$ which takes on 8 values. One last step is we should normalize the message, to avoid underflow later on..." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Message: [0.05341282 0.13164231 0.25865751 0.07494934 0.16285072 0.09409567\n", " 0.12296126 0.10143037]\n" ] } ], "source": [ "mu_f2x = mu_f2x / np.sum(mu_f2x) # can divide by a constant\n", "print('Message: ', mu_f2x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That is one way to compute the factor-to-variable message. Note that we can use Numpy broadcasting to avoid explicitly using Numpy.tile(). " ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0.05341282, 0.13164231, 0.25865751, 0.07494934, 0.16285072,\n", " 0.09409567, 0.12296126, 0.10143037])" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Compute factor-to-variable message\n", "mu_f2x_again = f * msg_prod[np.newaxis, :, :] # <== Braodcast 1st dim.\n", "mu_f2x_again = np.sum(mu_f2x_again, axis=(1,2))\n", "mu_f2x_again /= np.sum(mu_f2x_again) # Normalize\n", "mu_f2x_again " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Observe that we get the same result with fewer lines of code, and better memory usage. We can write even shorter and more efficient code by using Numpy.tensordot(). The following lines compute the same message. The axes=((1,2),(0,1)) argument tells Numpy to align the 1st and 2nd axes of the factor with the 0th and 1st axes of the incoming message product (the Y and Z dimensions), then sum them out..." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0.05341282, 0.13164231, 0.25865751, 0.07494934, 0.16285072,\n", " 0.09409567, 0.12296126, 0.10143037])" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mu_f2x_again = np.tensordot(f, msg_prod, axes=((1,2),(0,1)))\n", "mu_f2x_again /= np.sum(mu_f2x_again) # Normalize\n", "mu_f2x_again " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What if we want to send a message from factor $f$ to variable $y$? In this case, $y$ is the 2nd dimension (1st axis) of the factor array. We just need to keep track of which variables are on each dimension. In this case, the formula is,\n", "\\begin{equation}\n", " \\mu_{f \\rightarrow y}(y) = \\sum_x \\sum_z f(x,y,z) \\mu_{x\\rightarrow f}(x) \\mu_{z\\rightarrow f}(z)\n", "\\end{equation}\n", "We just need to make sure to keep track of dimensions and align them properly..." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "msg_prod_2 shape: (8, 3)\n", "mu_f2y: [0.52465845 0.47534155]\n", "mu_f2y_again: [0.52465845 0.47534155]\n" ] } ], "source": [ "# random x-to-f message\n", "mu_x2f = np.random.rand(8)\n", "\n", "# compute product of incoming messages\n", "msg_prod_2 = np.tensordot(mu_x2f, mu_z2f, axes=0)\n", "print('msg_prod_2 shape: ', msg_prod_2.shape)\n", "\n", "# compute message with broadcasting\n", "tmp_msg_2 = f * msg_prod_2[:,np.newaxis,:]\n", "mu_f2y = np.sum(tmp_msg_2, axis=(0,2)) # sum over X and Z dimenions\n", "mu_f2y /= np.sum(mu_f2y) # normalize\n", "print('mu_f2y: ', mu_f2y)\n", "\n", "# compute using tensordot\n", "mu_f2y_again = np.tensordot(f, msg_prod_2, axes=((0,2), (0,1))) # align (0,2) axes with (0,1) axes and sum\n", "mu_f2y_again /= np.sum(mu_f2y_again) # normalize\n", "print('mu_f2y_again: ', mu_f2y_again)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Observe that either method yields the same result. Use whichever makes the most sense for you." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.8" } }, "nbformat": 4, "nbformat_minor": 5 }