The original Pointer Networks paper was accepted to NeurIPS 2015, which makes it fairly old by deep learning standards. Nevertheless, it has collected more than 1,700 citations to date, has been integrated into modern solutions [2, 3], has received many improvements [4, 5], and has inspired alternative architectures. It even plays a small but important role in the state-of-the-art StarCraft II agent created by Tencent AI Lab. What is it about pointer networks that makes them so applicable today?
This simple and sleek architecture handles a subtle complication in sequence prediction. Suppose we want to predict an ordering of indices over the input sequence. What do we do if the input length varies? With previous methods, the vocabulary size (i.e., the set of input sequence indices) must be fixed a priori. This is fine for problems such as sentence generation, where the input vocabulary is a known set of characters. Combinatorial problems, on the other hand, are where pointer networks shine. With pointer networks we can solve combinatorial optimization problems in which the output is a set of indices into an input that is defined at inference time, not at training time. The pointer network paper covers three such tasks: planar convex hulls, Delaunay triangulation, and the planar symmetric Traveling Salesman Problem.
There are many implementations of pointer networks on GitHub (you can find a handful of them linked here), but most of them were released soon after the paper and use LSTM- or GRU-based encoders, which have since fallen out of favor with the success of transformers. This post comes with a Jupyter notebook that you can follow along with here. Much of the code is adapted from the pointer-networks-pytorch repo by GitHub user ast0414, which solves an integer sorting problem. I extend that work by solving the planar convex hull problem and by replacing the LSTM-based encoder and decoder with transformers.
As previously mentioned, pointer networks are used for combinatorial problems, such as finding a planar convex hull. Pointer networks also appear in several newer applications. In TStarBot-X, an AI agent for playing StarCraft II (and in its predecessor AlphaStar, created by DeepMind), a pointer network is used to select the target units for a particular action. PolyGen, an autoregressive generative model for 3D meshes (which I have written more about here), uses a pointer network to assign vertices to a specific face, pointing out one face at a time until a mesh topology is formed. Pointer networks have even been used to select the key sentences that summarize a document well [2]. In this post, we show how to solve the planar convex hull problem with pointer networks.
Pointer networks effectively implement an attention mechanism over a varying number of tokens. Given a sequence (t0, t1, …, tn-1), the pointer network attends over the candidate vocabulary for the next token tn. Each of these input candidates is associated with an embedding vector produced by the encoder. Similarly, the target sequence tokens each have their own decoder embeddings. Unlike other sequence prediction models, pointer networks are constructed so that the number of input candidates can change at inference time.
The key innovation in the pointer network is a well-designed attention mechanism. The pair of equations below forms the core of the pointer network logic. It first consists of two learned weight matrices multiplied by the encoder embeddings and the decoder embeddings, followed by a nonlinearity and a multiplication by a second learned weight vector v, which reduces the feature dimension. This layer essentially combines the features of the input and target embeddings into a grid of activations corresponding to the input-target pairs. Finally, a softmax layer expresses a conditional probability over the dimension corresponding to the inputs. What we are left with (for each batch row) is, for each position in the output sequence, a probability distribution over the input tokens.
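In the paper's notation, the scores for decoding step i are u_j = vᵀ tanh(W1·e_j + W2·d_i), and the pointer distribution is softmax(u). A minimal NumPy sketch of a single decoding step (the function name and shapes are my own illustrative choices, not the paper's code):

```python
import numpy as np

def pointer_attention(E, d, W1, W2, v):
    """One pointer-attention step over n encoder embeddings.

    E:  (n, h) encoder embeddings e_j
    d:  (h,)   decoder embedding d_i for the current step
    W1: (h, h), W2: (h, h) learned weight matrices
    v:  (h,)   learned weight vector that reduces the feature dimension
    Returns a length-n probability distribution over the input tokens.
    """
    # u_j = v^T tanh(W1 e_j + W2 d_i): one scalar score per input token
    u = np.tanh(E @ W1.T + d @ W2.T) @ v      # (n,)
    u = u - u.max()                           # numerical stability
    p = np.exp(u) / np.exp(u).sum()           # softmax over the inputs
    return p

rng = np.random.default_rng(0)
n, h = 5, 8
p = pointer_attention(rng.normal(size=(n, h)), rng.normal(size=h),
                      rng.normal(size=(h, h)), rng.normal(size=(h, h)),
                      rng.normal(size=h))
```

Note that the output distribution is over the n input positions themselves, which is what lets the vocabulary size follow the input length.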
How do we translate this expression into code? The following code snippet shows how to implement the pointer network as a PyTorch module.
The inputs are the input embeddings produced by the transformer encoder, the target embeddings from the transformer decoder, and a binary mask covering the input padding, respectively. The weight matrices are implemented as linear layers. In addition, the softmax layer is masked so that padding and previously predicted tokens can be ignored.
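A sketch of such a module might look like the following (an illustrative reimplementation of the idea with padding masking, not necessarily the notebook's exact code; names and shapes are my own):

```python
import torch
import torch.nn as nn

class PointerAttention(nn.Module):
    """Pointer-network head: scores every encoder position for each decoder step."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.W1 = nn.Linear(hidden_dim, hidden_dim, bias=False)  # encoder projection
        self.W2 = nn.Linear(hidden_dim, hidden_dim, bias=False)  # decoder projection
        self.v = nn.Linear(hidden_dim, 1, bias=False)            # score vector

    def forward(self, enc: torch.Tensor, dec: torch.Tensor,
                mask: torch.Tensor) -> torch.Tensor:
        # enc: (B, n, h), dec: (B, m, h), mask: (B, n), True where padded
        # Broadcast to a (B, m, n, h) grid of input-target pair features.
        feats = torch.tanh(self.W1(enc).unsqueeze(1) + self.W2(dec).unsqueeze(2))
        scores = self.v(feats).squeeze(-1)                    # (B, m, n)
        # Padded positions get -inf so the softmax assigns them zero probability.
        scores = scores.masked_fill(mask.unsqueeze(1), float("-inf"))
        return torch.log_softmax(scores, dim=-1)              # log-probs over inputs
```

Returning log-probabilities makes the module pair naturally with `nn.NLLLoss` during training; masking previously predicted tokens at inference time works the same way, by adding their positions to the mask.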
This module serves as the task-specific head of the overall architecture for the convex hull problem. However, it is designed for general use and can be applied to other combinatorial problems involving any number of input and target embeddings.
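At inference time, the mask also grows as we decode: each predicted index is masked out so it cannot be chosen twice. A hedged sketch of a greedy decoding loop over precomputed pointer scores (pure NumPy; the function name is my own, not from the notebook):

```python
import numpy as np

def greedy_decode(scores: np.ndarray, steps: int) -> list:
    """Greedily pick one input index per decoder step, without repeats.

    scores: (steps, n) unnormalized pointer scores, one row per decoder step.
    """
    banned = np.zeros(scores.shape[1], dtype=bool)
    picks = []
    for i in range(steps):
        s = np.where(banned, -np.inf, scores[i])  # mask previously chosen tokens
        j = int(np.argmax(s))
        picks.append(j)
        banned[j] = True                          # this index can't be picked again
    return picks

print(greedy_decode(np.array([[0.1, 2.0, 0.3],
                              [0.0, 5.0, 1.0],
                              [4.0, 0.0, 0.0]]), 3))  # → [1, 2, 0]
```

In a real model the scores for step i depend on the token picked at step i−1 (it is fed back into the decoder), but the masking logic is the same.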
Suppose we have a corkboard covered with randomly placed thumbtacks. If we stretch an elastic band around these tacks so that it surrounds all of them, the band touches only a subset of the tacks. This subset forms the convex hull of the set of points. Given a set of points S, the planar convex hull is the smallest subset of points Sc that forms a convex polygon containing all the points. The polygon formed by any other subset of S covers at most the area enclosed by this convex hull.
Convex hulls have numerous applications in fields as diverse as mathematics, statistics, computer graphics, and ethology. Here, however, we care about convex hulls only as an interesting toy problem for pointer networks. The convex hull of a set of points is a combinatorial problem that is well suited to pointer networks, and we can easily generate an unlimited amount of training data for it using only NumPy and Shapely.
We create synthetic data by sampling points uniformly at random in the (x, y) plane. Each of these input points is fed to the encoder to produce an embedding, and these embeddings act as the input vocabulary. One-hot encodings of two control tokens are also added:
<sos> to indicate the start of the sequence and
<eos> to indicate the end of the sequence. The size of this input vocabulary therefore depends on the number of input points as well as on the two control tokens. The task of the pointer network is to select tokens from this vocabulary to form the sequence of the convex hull. See the Google Colab notebook for more details.
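The notebook uses Shapely to compute the ground-truth hulls; as a dependency-free illustration, the same labels can be produced with Andrew's monotone chain algorithm (a sketch under my own naming, not the notebook's code):

```python
import numpy as np

def convex_hull(points):
    """Andrew's monotone chain: return hull vertex indices in counter-clockwise order."""
    idx = sorted(range(len(points)), key=lambda i: (points[i][0], points[i][1]))

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn
        return ((points[a][0] - points[o][0]) * (points[b][1] - points[o][1])
                - (points[a][1] - points[o][1]) * (points[b][0] - points[o][0]))

    def half(indices):
        h = []
        for i in indices:
            while len(h) >= 2 and cross(h[-2], h[-1], i) <= 0:
                h.pop()
            h.append(i)
        return h[:-1]  # last point repeats as the first point of the other half

    return half(idx) + half(idx[::-1])

# One synthetic training pair: uniform points in, hull index sequence out
rng = np.random.default_rng(42)
pts = rng.uniform(size=(10, 2)).tolist()
label = convex_hull(pts)
```

The returned index sequence, wrapped in the <sos>/<eos> control tokens, is exactly the target the pointer network is trained to predict.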
In the Colab notebook, we train the network on 100,000 randomly generated input-output pairs with input lengths ranging from 5 to 50. Validation and test sets of 1,000 and 10,000 examples, respectively, are created with the same method. This method achieves a test accuracy of approximately 81.3% (the paper reports 69.6%) and an area overlap of 99.9% (the paper reports 99.9%).
The convex hulls it produces are very close to the solutions given by Shapely. In most cases where mistakes are made, the predicted convex hull still overlaps the true solution by at least 99%, especially for longer sequences (because individual points have less effect on the total area). Try the included Colab notebook yourself and study the code.
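The overlap metric compares the area of the predicted polygon with the area of the true hull. The area itself is easy to compute with the shoelace formula (a hedged sketch; the notebook relies on Shapely for this):

```python
def polygon_area(points):
    """Shoelace formula: area of a simple polygon given its vertices in order."""
    n = len(points)
    s = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0

# When the predicted polygon lies inside the true hull, the ratio of areas
# is the fraction of the true hull that the prediction covers.
true_hull = [(0, 0), (1, 0), (1, 1), (0, 1)]
pred_hull = [(0, 0), (1, 0), (1, 1)]  # hypothetical prediction missing one corner
overlap = polygon_area(pred_hull) / polygon_area(true_hull)
print(overlap)  # → 0.5
```

This is why a single missed point barely hurts the overlap on long sequences: one vertex of a 50-point hull bounds only a small slice of the total area.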
Pointer networks are a simple but versatile architecture that can be incorporated into neural networks to predict sequences over input vocabularies of varying lengths. This post focused on the planar convex hull problem, but they can be applied to any problem where we want to predict a set of indices referring to a set of encoded inputs.
 Oriol Vinyals, Meire Fortunato, Navdeep Jaitly – Pointer Networks (2015), NeurIPS 2015
 Yen-Chun Chen, Mohit Bansal – Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting (2018), ACL 2018
 Charlie Nash, Yaroslav Ganin, S. M. Ali Eslami, Peter W. Battaglia – PolyGen: An Autoregressive Generative Model of 3D Meshes (2020), ICML 2020
 Avishkar Bhoopchand, Tim Rocktäschel, Earl Barr, Sebastian Riedel – Learning Python Code Suggestion with a Sparse Pointer Network (2016), ICLR 2017 submission
 Yongjing Yin, Fandong Meng, Jinsong Su, Yubin Ge, Linfeng Song, Jie Zhou, Jiebo Luo – Enhancing Pointer Network for Sentence Ordering with Pairwise Ordering Predictions (2020), AAAI 2020
 Wouter Kool, Herke van Hoof, Max Welling – Attention, Learn to Solve Routing Problems! (2019), ICLR 2019
 Lei Han, Jiechao Xiong, Peng Sun, Xinghai Sun, Meng Fang, Qingwei Guo, Qiaobo Chen, Tengfei Shi, Hongsheng Yu, Xipeng Wu, Zhengyou Zhang – TStarBot-X: An Open-Sourced and Comprehensive Study for Efficient League Training in StarCraft II Full Game (2020), Tencent AI Lab
 Oriol Vinyals, Igor Babuschkin, David Silver – Grandmaster level in StarCraft II using multi-agent reinforcement learning (2019), Nature