I'm having some trouble mentally visualizing how a 1-dimensional convolutional layer feeds into a max pooling layer. I'm using Python 3.6.3, Keras 2.1.2, and the TensorFlow 1.4.0 backend.
In [1]: # Build model
...: # ===========
...: from keras.models import Model
...: from keras.layers import Input, Conv1D, MaxPooling1D, SimpleRNN, Flatten, Dense, Activation
...:
...: NUMBER_OF_POSITIONS = 500
...: # Input layer
...: input_layer = Input(shape=(NUMBER_OF_POSITIONS, 4))
...: # Hidden layers
...: _h = Conv1D(320, 16, strides=1, activation="relu")(input_layer)
...: _h = MaxPooling1D(pool_size=8, strides=8)(_h)
...: _h = SimpleRNN(128, return_sequences=True, activation="tanh")(_h)
...: _h = Flatten()(_h)
...: _h = Dense(1)(_h)
...: # Output layer
...: output_layer = Activation("sigmoid")(_h)
...:
...: model = Model(input_layer, output_layer)
...: # Single sigmoid output, so binary (not categorical) crossentropy
...: model.compile(optimizer="sgd", loss="binary_crossentropy")
...: model.summary()
...:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 500, 4)            0
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 485, 320)          20800
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 60, 320)           0
_________________________________________________________________
simple_rnn_1 (SimpleRNN)     (None, 60, 128)           57472
_________________________________________________________________
flatten_1 (Flatten)          (None, 7680)              0
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 7681
_________________________________________________________________
activation_1 (Activation)    (None, 1)                 0
=================================================================
Total params: 85,953
Trainable params: 85,953
Non-trainable params: 0
_________________________________________________________________
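For what it's worth, every number in that summary can be reproduced with a few lines of arithmetic (a sketch, assuming Keras's default bias terms for the Conv1D, SimpleRNN, and Dense layers):

```python
# Reproduce the output lengths and parameter counts from model.summary()
# by hand -- pure arithmetic, no Keras needed.
n_positions, n_channels = 500, 4
n_filters, kernel_size = 320, 16
pool_size = pool_stride = 8
rnn_units = 128

conv_len = n_positions - kernel_size + 1                  # 485 valid positions
conv_params = n_filters * (kernel_size * n_channels + 1)  # weights + 1 bias per filter
pool_len = (conv_len - pool_size) // pool_stride + 1      # 60 pooled positions
rnn_params = rnn_units * (n_filters + rnn_units + 1)      # input + recurrent + bias
dense_params = pool_len * rnn_units + 1                   # flattened inputs + bias

print(conv_len, conv_params, pool_len, rnn_params, dense_params)
```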
My understanding of a 1-dimensional convolution with kernel_size=16 and stride=1 is that it essentially steps through a vector, taking overlapping 16-element subsets. For example, below:
# Generate a 20-element sequence, then step through it with a window of size 16, stride 1
# https://pastebin.com/k7tzHvYN
sequence = "ATCATTTTCTCGATGAAAGC"
filter_size = 16
for i in range(len(sequence) - filter_size + 1):
    print(sequence[i:i + filter_size])
# ATCATTTTCTCGATGA
# TCATTTTCTCGATGAA
# CATTTTCTCGATGAAA
# ATTTTCTCGATGAAAG
# TTTTCTCGATGAAAGC
(1) What is the difference between convolutions, feature maps, and filters?
(2) Is my pooling layer reducing the dimensionality from 16 to 8 by taking the maximum value for every 2 positions? I'm also not sure how strides apply to this part of the pipeline.
(3) How does max pooling work with one-hot encoded categorical variables? For example, [[1 0 0 0], [0 0 0 1], [0 1 0 0]].
Response to Alex R's answer below:
Terminology is wishy-washy here, but in this case the feature maps refer to the outputs of each convolution (filter). In this case you are applying 1x16 convolutions, stride 1, to your input of size 500x4, which gives you 500-16+1=485 positions to apply the convolution. Note that since your image depth is 4, then each convolution has 1x16x4 weights total.
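A minimal NumPy sketch of that arithmetic, sliding one (16 x 4) filter along a (500, 4) input (the weights here are random placeholders, not anything Keras would actually learn):

```python
# One Conv1D filter: slide a (kernel_size x channels) weight block along
# the sequence and take a dot product at each of 500 - 16 + 1 = 485 positions.
import numpy as np

rng = np.random.RandomState(0)
x = rng.rand(500, 4)   # one input: 500 positions, 4 channels
w = rng.rand(16, 4)    # one filter: 16 x 4 = 64 weights (plus a bias in Keras)

out = np.array([np.sum(x[i:i + 16] * w) for i in range(500 - 16 + 1)])
print(out.shape)       # one feature map; 320 such filters stack into (485, 320)
```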
So in this case, each (1 x 16) convolution is a filter that has 4 channels. Is the feature map going to be the connections between all of the (1 x 16) pixels in the input vector and their maximum value as determined in the pooling layer? Or would these be the weights? Also, did you mean the output of the convolution is (485, 320), or were you alluding to the 5 positions that were dropped?
You are then applying a maxpool of size 8, with stride 8, meaning that each 8x8 cell will condense into a single cell whose value is the maximum. With a stride of 8, you will do 60 maxpools (up to pixel 480). I believe the last 5 pixels are just thrown out.
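That pooling step can be sketched in NumPy along a single feature map (a counting sequence stands in for one of the 320 conv output channels, so it's easy to see which positions survive):

```python
# Max pooling with pool_size=8, stride=8 over one 485-long feature map:
# take the max of each non-overlapping block of 8; the last 5 values
# (positions 480-484) don't fill a complete block and are dropped.
import numpy as np

feature_map = np.arange(485, dtype=float)  # stand-in for one conv output channel
pool_size = stride = 8
pooled = np.array([feature_map[i:i + pool_size].max()
                   for i in range(0, len(feature_map) - pool_size + 1, stride)])
print(pooled.shape)
```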
This is where I'm lost a little bit. If the initial input was (500, 4), what does the (8, 8) cell represent? Would this be grouping 8 filters together? I understand that 8 * 60 is 480, but I'm having trouble understanding what the strides are in this case, since the inputs are convolution outputs. Would this be taking a set of 8 positions (pool_size), applying the maximum, and then moving past those 8 (stride) to the next 8?
Your question about one-hot encoded vector maxpooling is strange. Max-pool would operate no differently in that case, taking the maximum value within the max-pool region.
Sorry, most of the examples I've seen use RGB values, which are usually not in {0, 1}, so the maximum makes more sense to me there. Would the maximum be taken per channel in that filter?
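To make the question concrete, here is a quick NumPy check of what a per-channel max over the three one-hot rows from my example would give (note this pools the raw input; in the actual model, pooling happens on the conv activations, not on the one-hot encoding itself):

```python
# Pooling is applied independently per channel, so pooling these three
# one-hot rows just takes the column-wise (per-channel) maximum.
import numpy as np

x = np.array([[1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 1, 0, 0]])
print(x.max(axis=0))
```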