DecodableNnetSimple Class Reference

#include <nnet-am-decodable-simple.h>

Public Member Functions

 DecodableNnetSimple (const NnetSimpleComputationOptions &opts, const Nnet &nnet, const VectorBase< BaseFloat > &priors, const MatrixBase< BaseFloat > &feats, CachingOptimizingCompiler *compiler, const VectorBase< BaseFloat > *ivector=NULL, const MatrixBase< BaseFloat > *online_ivectors=NULL, int32 online_ivector_period=1)
 This constructor takes features as input. You can supply a single iVector estimated in batch mode ('ivector'), 'online' iVectors ('online_ivectors' and 'online_ivector_period'), or no iVectors at all.
 
int32 NumFrames () const
 
int32 OutputDim () const
 
void GetOutputForFrame (int32 frame, VectorBase< BaseFloat > *output)
 
BaseFloat GetOutput (int32 subsampled_frame, int32 pdf_id)
 

Private Member Functions

 KALDI_DISALLOW_COPY_AND_ASSIGN (DecodableNnetSimple)
 
void EnsureFrameIsComputed (int32 subsampled_frame)
 
void DoNnetComputation (int32 input_t_start, const MatrixBase< BaseFloat > &input_feats, const VectorBase< BaseFloat > &ivector, int32 output_t_start, int32 num_subsampled_frames)
 
void GetCurrentIvector (int32 output_t_start, int32 num_output_frames, Vector< BaseFloat > *ivector)
 
void CheckAndFixConfigs ()
 
int32 GetIvectorDim () const
 

Private Attributes

NnetSimpleComputationOptions opts_
 
const Nnet & nnet_
 
int32 nnet_left_context_
 
int32 nnet_right_context_
 
int32 output_dim_
 
CuVector< BaseFloat > log_priors_
 
const MatrixBase< BaseFloat > & feats_
 
int32 num_subsampled_frames_
 
const VectorBase< BaseFloat > * ivector_
 
const MatrixBase< BaseFloat > * online_ivector_feats_
 
int32 online_ivector_period_
 
CachingOptimizingCompiler & compiler_
 
Matrix< BaseFloat > current_log_post_
 
int32 current_log_post_subsampled_offset_
 

Detailed Description

Definition at line 148 of file nnet-am-decodable-simple.h.

Constructor & Destructor Documentation

◆ DecodableNnetSimple()

DecodableNnetSimple ( const NnetSimpleComputationOptions &  opts,
const Nnet &  nnet,
const VectorBase< BaseFloat > &  priors,
const MatrixBase< BaseFloat > &  feats,
CachingOptimizingCompiler *  compiler,
const VectorBase< BaseFloat > *  ivector = NULL,
const MatrixBase< BaseFloat > *  online_ivectors = NULL,
int32  online_ivector_period = 1 
)

This constructor takes features as input. You can supply a single iVector estimated in batch mode ('ivector'), 'online' iVectors ('online_ivectors' and 'online_ivector_period'), or no iVectors at all.

Note: the object stores references or pointers to nnet, feats, compiler, ivector and online_ivectors rather than copies (opts and priors are copied), so do not delete them until this object goes out of scope.

Parameters
[in]  opts  The options class. Warning: it includes an acoustic weight, whose default is 0.1; you may sometimes want to change this to 1.0.
[in]  nnet  The neural net that the computation is done with.
[in]  priors  Vector of priors. If supplied and nonempty, the log of these priors is subtracted from the nnet output.
[in]  feats  The input feature matrix.
[in]  compiler  A pointer to the compiler object to use. This lets the calling code maintain a common object that caches computations across decodes. Note: the compiler code has no locking mechanism (and it would be tricky to design one, as the individual computations would also need locking), so if there are multiple threads, the calling code must make sure they do not share the same compiler object.
[in]  ivector  If you are using iVectors estimated in batch mode, a pointer to the iVector, else NULL.
[in]  online_ivectors  If you are using iVectors estimated 'online', a pointer to the iVectors, else NULL.
[in]  online_ivector_period  If you are using iVectors estimated 'online' (i.e. if online_ivectors != NULL), the periodicity (in frames) with which the iVectors are estimated.
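
The 'opts' and 'priors' parameters interact in the way DoNnetComputation() applies them: the log-prior is subtracted from each network output and the result is multiplied by the acoustic scale. The following is a minimal standalone sketch of that arithmetic for one output row. It uses std::vector instead of Kaldi's matrix types, and the function name ScaledLogLikelihoods is hypothetical, not part of Kaldi.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Kaldi function): turns one row of network
// output (log-posteriors) into scaled pseudo-log-likelihoods.
std::vector<float> ScaledLogLikelihoods(const std::vector<float> &log_post,
                                        const std::vector<float> &priors,
                                        float acoustic_scale) {
  // An empty prior vector means "skip the prior subtraction".
  assert(priors.empty() || priors.size() == log_post.size());
  std::vector<float> out(log_post);
  for (std::size_t i = 0; i < out.size(); ++i) {
    if (!priors.empty())
      out[i] -= std::log(priors[i]);  // subtract log-prior (divide by prior)
    out[i] *= acoustic_scale;         // apply the acoustic scale
  }
  return out;
}
```

With an acoustic scale of 1.0 the result is the plain log of posterior divided by prior; with the default of 0.1 every value is additionally shrunk by a factor of ten.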

Definition at line 27 of file nnet-am-decodable-simple.cc.

References DecodableNnetSimple::CheckAndFixConfigs(), DecodableNnetSimple::compiler_, DecodableNnetSimple::feats_, NnetSimpleComputationOptions::frame_subsampling_factor, CachingOptimizingCompiler::GetSimpleNnetContext(), kaldi::nnet3::IsSimpleNnet(), KALDI_ASSERT, DecodableNnetSimple::log_priors_, DecodableNnetSimple::nnet_left_context_, DecodableNnetSimple::nnet_right_context_, DecodableNnetSimple::num_subsampled_frames_, and DecodableNnetSimple::opts_.

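DecodableNnetSimple::DecodableNnetSimple(
    const NnetSimpleComputationOptions &opts,
    const Nnet &nnet,
    const VectorBase<BaseFloat> &priors,
    const MatrixBase<BaseFloat> &feats,
    CachingOptimizingCompiler *compiler,
    const VectorBase<BaseFloat> *ivector,
    const MatrixBase<BaseFloat> *online_ivectors,
    int32 online_ivector_period):
    opts_(opts),
    nnet_(nnet),
    output_dim_(nnet_.OutputDim("output")),
    log_priors_(priors),
    feats_(feats),
    ivector_(ivector), online_ivector_feats_(online_ivectors),
    online_ivector_period_(online_ivector_period),
    compiler_(*compiler),
    current_log_post_subsampled_offset_(0) {
  num_subsampled_frames_ =
      (feats_.NumRows() + opts_.frame_subsampling_factor - 1) /
      opts_.frame_subsampling_factor;
  KALDI_ASSERT(IsSimpleNnet(nnet));
  compiler_.GetSimpleNnetContext(&nnet_left_context_, &nnet_right_context_);
  KALDI_ASSERT(!(ivector != NULL && online_ivectors != NULL));
  KALDI_ASSERT(!(online_ivectors != NULL && online_ivector_period <= 0 &&
                 "You need to set the --online-ivector-period option!"));
  log_priors_.ApplyLog();
  CheckAndFixConfigs();
}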
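
The constructor derives the number of subsampled (output) frames from the number of feature rows by rounding the division by the frame-subsampling factor upward. A small standalone sketch of that computation, assuming plain integer inputs; the helper name NumSubsampledFrames is hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a Kaldi function): ceiling division of the number
// of input frames by the frame-subsampling factor.
std::int32_t NumSubsampledFrames(std::int32_t num_input_frames,
                                 std::int32_t frame_subsampling_factor) {
  assert(num_input_frames >= 0 && frame_subsampling_factor > 0);
  return (num_input_frames + frame_subsampling_factor - 1) /
         frame_subsampling_factor;
}
```

For example, 100 feature rows with a factor of 3 give 34 output frames, because the final partial group of frames still produces an output.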

Member Function Documentation

◆ CheckAndFixConfigs()

void CheckAndFixConfigs ( )
private

Definition at line 278 of file nnet-am-decodable-simple.cc.

References KALDI_ASSERT, KALDI_ERR, KALDI_LOG, kaldi::Lcm(), and rnnlm::n.

Referenced by DecodableNnetSimple::DecodableNnetSimple().

void DecodableNnetSimple::CheckAndFixConfigs() {
  static bool warned_frames_per_chunk = false;
  int32 nnet_modulus = nnet_.Modulus();
  if (opts_.frame_subsampling_factor < 1 ||
      opts_.frames_per_chunk < 1)
    KALDI_ERR << "--frame-subsampling-factor and --frames-per-chunk must be > 0";
  KALDI_ASSERT(nnet_modulus > 0);
  int32 n = Lcm(opts_.frame_subsampling_factor, nnet_modulus);

  if (opts_.frames_per_chunk % n != 0) {
    // round up to the nearest multiple of n.
    int32 frames_per_chunk = n * ((opts_.frames_per_chunk + n - 1) / n);
    if (!warned_frames_per_chunk) {
      warned_frames_per_chunk = true;
      if (nnet_modulus == 1) {
        // simpler log message.
        KALDI_LOG << "Increasing --frames-per-chunk from "
                  << opts_.frames_per_chunk << " to "
                  << frames_per_chunk << " to make it a multiple of "
                  << "--frame-subsampling-factor="
                  << opts_.frame_subsampling_factor;
      } else {
        KALDI_LOG << "Increasing --frames-per-chunk from "
                  << opts_.frames_per_chunk << " to "
                  << frames_per_chunk << " due to "
                  << "--frame-subsampling-factor="
                  << opts_.frame_subsampling_factor << " and "
                  << "nnet shift-invariance modulus = " << nnet_modulus;
      }
    }
    opts_.frames_per_chunk = frames_per_chunk;
  }
}
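
The rounding step in CheckAndFixConfigs() can be tried in isolation. The sketch below reproduces only the arithmetic (least common multiple, then round up to a multiple of it), without the logging or the options struct. The names Gcd and RoundUpFramesPerChunk are hypothetical stand-ins, not Kaldi functions.

```cpp
#include <cassert>
#include <cstdint>

// Greatest common divisor by Euclid's algorithm (stand-in for a math helper).
std::int32_t Gcd(std::int32_t a, std::int32_t b) {
  while (b != 0) {
    std::int32_t t = a % b;
    a = b;
    b = t;
  }
  return a;
}

// Hypothetical helper: rounds frames_per_chunk up to the nearest multiple of
// lcm(frame_subsampling_factor, nnet_modulus).
std::int32_t RoundUpFramesPerChunk(std::int32_t frames_per_chunk,
                                   std::int32_t frame_subsampling_factor,
                                   std::int32_t nnet_modulus) {
  assert(frames_per_chunk > 0 && frame_subsampling_factor > 0 &&
         nnet_modulus > 0);
  const std::int32_t n = frame_subsampling_factor /
                         Gcd(frame_subsampling_factor, nnet_modulus) *
                         nnet_modulus;
  return n * ((frames_per_chunk + n - 1) / n);
}
```

With --frames-per-chunk=50 and --frame-subsampling-factor=3, a modulus of 1 yields 51, while a modulus of 4 yields 60 (the next multiple of 12).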

◆ DoNnetComputation()

void DoNnetComputation ( int32  input_t_start,
const MatrixBase< BaseFloat > &  input_feats,
const VectorBase< BaseFloat > &  ivector,
int32  output_t_start,
int32  num_subsampled_frames 
)
private

Definition at line 215 of file nnet-am-decodable-simple.cc.

References NnetComputer::AcceptInput(), CuMatrixBase< Real >::AddVecToRows(), CachingOptimizingCompiler::Compile(), DecodableAmNnetSimple::compiler_, VectorBase< Real >::Dim(), NnetComputer::GetOutputDestructive(), IoSpecification::has_deriv, rnnlm::i, IoSpecification::indexes, ComputationRequest::inputs, IoSpecification::name, ComputationRequest::need_model_derivative, MatrixBase< Real >::NumRows(), ComputationRequest::outputs, CuMatrix< Real >::Resize(), CuMatrixBase< Real >::Row(), NnetComputer::Run(), CuMatrixBase< Real >::Scale(), ComputationRequest::store_component_stats, and CuMatrix< Real >::Swap().

void DecodableNnetSimple::DoNnetComputation(
    int32 input_t_start,
    const MatrixBase<BaseFloat> &input_feats,
    const VectorBase<BaseFloat> &ivector,
    int32 output_t_start,
    int32 num_subsampled_frames) {
  ComputationRequest request;
  request.need_model_derivative = false;
  request.store_component_stats = false;

  bool shift_time = true;  // shift the 'input' and 'output' to a consistent
                           // time, to take advantage of caching in the compiler.
                           // An optimization.
  int32 time_offset = (shift_time ? -output_t_start : 0);

  // First add the regular features-- named "input".
  request.inputs.reserve(2);
  request.inputs.push_back(
      IoSpecification("input", time_offset + input_t_start,
                      time_offset + input_t_start + input_feats.NumRows()));
  if (ivector.Dim() != 0) {
    std::vector<Index> indexes;
    indexes.push_back(Index(0, 0, 0));
    request.inputs.push_back(IoSpecification("ivector", indexes));
  }
  IoSpecification output_spec;
  output_spec.name = "output";
  output_spec.has_deriv = false;
  int32 subsample = opts_.frame_subsampling_factor;
  output_spec.indexes.resize(num_subsampled_frames);
  // leave n and x values at 0 (the constructor sets these).
  for (int32 i = 0; i < num_subsampled_frames; i++)
    output_spec.indexes[i].t = time_offset + output_t_start + i * subsample;
  request.outputs.resize(1);
  request.outputs[0].Swap(&output_spec);

  std::shared_ptr<const NnetComputation> computation = compiler_.Compile(request);
  Nnet *nnet_to_update = NULL;  // we're not doing any update.
  NnetComputer computer(opts_.compute_config, *computation,
                        nnet_, nnet_to_update);

  CuMatrix<BaseFloat> input_feats_cu(input_feats);
  computer.AcceptInput("input", &input_feats_cu);
  CuMatrix<BaseFloat> ivector_feats_cu;
  if (ivector.Dim() > 0) {
    ivector_feats_cu.Resize(1, ivector.Dim());
    ivector_feats_cu.Row(0).CopyFromVec(ivector);
    computer.AcceptInput("ivector", &ivector_feats_cu);
  }
  computer.Run();
  CuMatrix<BaseFloat> cu_output;
  computer.GetOutputDestructive("output", &cu_output);
  // subtract log-prior (divide by prior)
  if (log_priors_.Dim() != 0)
    cu_output.AddVecToRows(-1.0, log_priors_);
  // apply the acoustic scale
  cu_output.Scale(opts_.acoustic_scale);
  current_log_post_.Resize(0, 0);
  // the following statement just swaps the pointers if we're not using a GPU.
  cu_output.Swap(&current_log_post_);
  current_log_post_subsampled_offset_ = output_t_start / subsample;
}
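
The 'shift_time' optimization in DoNnetComputation() works because two chunks of the same shape, taken at different positions in the utterance, map to identical time indexes once output_t_start is subtracted, so the compiler can reuse a cached computation. The sketch below illustrates only that index arithmetic. ShiftedRequest and MakeShiftedRequest are hypothetical names used for illustration; they are not Kaldi's ComputationRequest API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for the time indexes of a request.
struct ShiftedRequest {
  std::int32_t input_t_begin;          // first input time index (inclusive)
  std::int32_t input_t_end;            // last input time index + 1
  std::vector<std::int32_t> output_t;  // requested output time indexes
};

// Shifts all time indexes so that the first output frame is at t = 0.
ShiftedRequest MakeShiftedRequest(std::int32_t input_t_start,
                                  std::int32_t num_input_rows,
                                  std::int32_t output_t_start,
                                  std::int32_t num_subsampled_frames,
                                  std::int32_t subsample) {
  assert(num_input_rows > 0 && num_subsampled_frames > 0 && subsample > 0);
  const std::int32_t time_offset = -output_t_start;
  ShiftedRequest r;
  r.input_t_begin = time_offset + input_t_start;
  r.input_t_end = r.input_t_begin + num_input_rows;
  for (std::int32_t i = 0; i < num_subsampled_frames; ++i)
    r.output_t.push_back(time_offset + output_t_start + i * subsample);
  return r;
}
```

A chunk starting at output frame 0 with input starting at -5 and a chunk starting at output frame 12 with input starting at 7 produce the same shifted request, which is what makes them cacheable under a single key.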

◆ EnsureFrameIsComputed()

void EnsureFrameIsComputed ( int32  subsampled_frame)
private

Definition at line 93 of file nnet-am-decodable-simple.cc.

References VectorBase< Real >::CopyFromVec(), rnnlm::i, KALDI_ASSERT, KALDI_ERR, and MatrixBase< Real >::NumRows().

void DecodableNnetSimple::EnsureFrameIsComputed(int32 subsampled_frame) {
  KALDI_ASSERT(subsampled_frame >= 0 &&
               subsampled_frame < num_subsampled_frames_);
  int32 feature_dim = feats_.NumCols(),
      ivector_dim = GetIvectorDim(),
      nnet_input_dim = nnet_.InputDim("input"),
      nnet_ivector_dim = std::max<int32>(0, nnet_.InputDim("ivector"));
  if (feature_dim != nnet_input_dim)
    KALDI_ERR << "Neural net expects 'input' features with dimension "
              << nnet_input_dim << " but you provided "
              << feature_dim;
  if (ivector_dim != std::max<int32>(0, nnet_.InputDim("ivector")))
    KALDI_ERR << "Neural net expects 'ivector' features with dimension "
              << nnet_ivector_dim << " but you provided " << ivector_dim;

  int32 current_subsampled_frames_computed = current_log_post_.NumRows(),
      current_subsampled_offset = current_log_post_subsampled_offset_;
  KALDI_ASSERT(subsampled_frame < current_subsampled_offset ||
               subsampled_frame >= current_subsampled_offset +
                                   current_subsampled_frames_computed);

  // all subsampled frames pertain to the output of the network,
  // they are output frames divided by opts_.frame_subsampling_factor.
  int32 subsampling_factor = opts_.frame_subsampling_factor,
      subsampled_frames_per_chunk = opts_.frames_per_chunk / subsampling_factor,
      start_subsampled_frame = subsampled_frame,
      num_subsampled_frames = std::min<int32>(num_subsampled_frames_ -
                                              start_subsampled_frame,
                                              subsampled_frames_per_chunk),
      last_subsampled_frame = start_subsampled_frame + num_subsampled_frames - 1;
  KALDI_ASSERT(num_subsampled_frames > 0);
  // the output-frame numbers are the subsampled-frame numbers
  int32 first_output_frame = start_subsampled_frame * subsampling_factor,
      last_output_frame = last_subsampled_frame * subsampling_factor;

  int32 extra_left_context = opts_.extra_left_context,
      extra_right_context = opts_.extra_right_context;
  if (first_output_frame == 0 && opts_.extra_left_context_initial >= 0)
    extra_left_context = opts_.extra_left_context_initial;
  if (last_subsampled_frame == num_subsampled_frames_ - 1 &&
      opts_.extra_right_context_final >= 0)
    extra_right_context = opts_.extra_right_context_final;
  int32 left_context = nnet_left_context_ + extra_left_context,
      right_context = nnet_right_context_ + extra_right_context;
  int32 first_input_frame = first_output_frame - left_context,
      last_input_frame = last_output_frame + right_context,
      num_input_frames = last_input_frame + 1 - first_input_frame;
  Vector<BaseFloat> ivector;
  GetCurrentIvector(first_output_frame,
                    last_output_frame - first_output_frame,
                    &ivector);

  Matrix<BaseFloat> input_feats;
  if (first_input_frame >= 0 &&
      last_input_frame < feats_.NumRows()) {
    SubMatrix<BaseFloat> input_feats(feats_.RowRange(first_input_frame,
                                                     num_input_frames));
    DoNnetComputation(first_input_frame, input_feats, ivector,
                      first_output_frame, num_subsampled_frames);
  } else {
    Matrix<BaseFloat> feats_block(num_input_frames, feats_.NumCols());
    int32 tot_input_feats = feats_.NumRows();
    for (int32 i = 0; i < num_input_frames; i++) {
      SubVector<BaseFloat> dest(feats_block, i);
      int32 t = i + first_input_frame;
      if (t < 0) t = 0;
      if (t >= tot_input_feats) t = tot_input_feats - 1;
      const SubVector<BaseFloat> src(feats_, t);
      dest.CopyFromVec(src);
    }
    DoNnetComputation(first_input_frame, feats_block, ivector,
                      first_output_frame, num_subsampled_frames);
  }
}
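
When the required input window extends past the start or end of the utterance, EnsureFrameIsComputed() pads it by repeating the first or last feature row. The sketch below shows just the row-index mapping of that padding, under the assumption that features are addressed by row number; PaddedSourceRows is a hypothetical helper name, not a Kaldi function.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: for each row of the padded input block, returns the
// row of the original feature matrix it is copied from. Out-of-range times
// are clamped to the first or last available row.
std::vector<std::int32_t> PaddedSourceRows(std::int32_t first_input_frame,
                                           std::int32_t num_input_frames,
                                           std::int32_t tot_input_feats) {
  assert(num_input_frames >= 0 && tot_input_feats > 0);
  std::vector<std::int32_t> rows;
  rows.reserve(num_input_frames);
  for (std::int32_t i = 0; i < num_input_frames; ++i) {
    std::int32_t t = i + first_input_frame;
    if (t < 0) t = 0;                                    // repeat first row
    if (t >= tot_input_feats) t = tot_input_feats - 1;   // repeat last row
    rows.push_back(t);
  }
  return rows;
}
```

For a 3-row utterance and a 5-frame window starting at frame -2, the mapping is 0, 0, 0, 1, 2: the two frames before the utterance reuse row 0.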

◆ GetCurrentIvector()

void GetCurrentIvector ( int32  output_t_start,
int32  num_output_frames,
Vector< BaseFloat > *  ivector 
)
private

Definition at line 181 of file nnet-am-decodable-simple.cc.

References KALDI_ASSERT, and KALDI_ERR.

void DecodableNnetSimple::GetCurrentIvector(int32 output_t_start,
                                            int32 num_output_frames,
                                            Vector<BaseFloat> *ivector) {
  if (ivector_ != NULL) {
    *ivector = *ivector_;
    return;
  } else if (online_ivector_feats_ == NULL) {
    return;
  }
  // frame_to_search is the frame that we want to get the most recent iVector
  // for.  We choose a point near the middle of the current window, the concept
  // being that this is the fairest comparison to nnet2.  Obviously we could do
  // better by always taking the last frame's iVector, but decoding with
  // 'online' ivectors is only really a mechanism to simulate online operation.
  int32 frame_to_search = output_t_start + num_output_frames / 2;
  int32 ivector_frame = frame_to_search / online_ivector_period_;
  KALDI_ASSERT(ivector_frame >= 0);
  if (ivector_frame >= online_ivector_feats_->NumRows()) {
    int32 margin = ivector_frame - (online_ivector_feats_->NumRows() - 1);
    if (margin * online_ivector_period_ > 50) {
      // Half a second seems like too long to be explainable as edge effects.
      KALDI_ERR << "Could not get iVector for frame " << frame_to_search
                << ", only available till frame "
                << online_ivector_feats_->NumRows()
                << " * ivector-period=" << online_ivector_period_
                << " (mismatched --online-ivector-period?)";
    }
    ivector_frame = online_ivector_feats_->NumRows() - 1;
  }
  *ivector = online_ivector_feats_->Row(ivector_frame);
}
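
The row selection for 'online' iVectors in GetCurrentIvector() comes down to picking the middle frame of the window, dividing by the iVector period, and clamping to the last available row. A minimal sketch of that index computation follows. OnlineIvectorRow is a hypothetical name, and the sketch deliberately omits the error raised when the clamped distance exceeds 50 frames.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: index of the online-iVector row used for a chunk.
// Unlike the code it illustrates, this sketch silently clamps and never
// reports an error for large mismatches.
std::int32_t OnlineIvectorRow(std::int32_t output_t_start,
                              std::int32_t num_output_frames,
                              std::int32_t online_ivector_period,
                              std::int32_t num_ivector_rows) {
  assert(output_t_start >= 0 && num_output_frames >= 0);
  assert(online_ivector_period > 0 && num_ivector_rows > 0);
  // Search near the middle of the current window.
  const std::int32_t frame_to_search = output_t_start + num_output_frames / 2;
  std::int32_t ivector_frame = frame_to_search / online_ivector_period;
  if (ivector_frame >= num_ivector_rows)
    ivector_frame = num_ivector_rows - 1;  // fall back to the last iVector
  return ivector_frame;
}
```

For a window of 48 frames starting at frame 0 and a period of 10, the middle frame is 24 and row 2 is used.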

◆ GetIvectorDim()

int32 GetIvectorDim ( ) const
private

Definition at line 84 of file nnet-am-decodable-simple.cc.

int32 DecodableNnetSimple::GetIvectorDim() const {
  if (ivector_ != NULL)
    return ivector_->Dim();
  else if (online_ivector_feats_ != NULL)
    return online_ivector_feats_->NumCols();
  else
    return 0;
}

◆ GetOutput()

BaseFloat GetOutput ( int32  subsampled_frame,
int32  pdf_id 
)
inline

Definition at line 205 of file nnet-am-decodable-simple.h.

References NnetSimpleComputationOptions::CheckAndFixConfigs(), and KALDI_DISALLOW_COPY_AND_ASSIGN.

Referenced by DecodableAmNnetSimple::LogLikelihood(), and DecodableAmNnetSimpleParallel::LogLikelihood().

inline BaseFloat GetOutput(int32 subsampled_frame, int32 pdf_id) {
  if (subsampled_frame < current_log_post_subsampled_offset_ ||
      subsampled_frame >= current_log_post_subsampled_offset_ +
                          current_log_post_.NumRows())
    EnsureFrameIsComputed(subsampled_frame);
  return current_log_post_(subsampled_frame -
                           current_log_post_subsampled_offset_,
                           pdf_id);
}
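
GetOutput() only triggers a new network computation when the requested frame lies outside the currently cached chunk of log-posteriors. The test it applies can be written as a small predicate; the sketch below assumes the cache is described by its starting offset and row count, and FrameIsCached is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if subsampled_frame falls inside the cached
// chunk [cache_offset, cache_offset + cache_num_rows).
bool FrameIsCached(std::int32_t subsampled_frame,
                   std::int32_t cache_offset,
                   std::int32_t cache_num_rows) {
  assert(cache_num_rows >= 0);
  return subsampled_frame >= cache_offset &&
         subsampled_frame < cache_offset + cache_num_rows;
}
```

An empty cache (zero rows), as right after construction, never contains the frame, so the first call always computes a chunk.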

◆ GetOutputForFrame()

void GetOutputForFrame ( int32  frame,
VectorBase< BaseFloat > *  output 
)

Definition at line 171 of file nnet-am-decodable-simple.cc.

References VectorBase< Real >::CopyFromVec().

Referenced by main(), and kaldi::nnet3::TestNnetDecodable().

void DecodableNnetSimple::GetOutputForFrame(
    int32 subsampled_frame, VectorBase<BaseFloat> *output) {
  if (subsampled_frame < current_log_post_subsampled_offset_ ||
      subsampled_frame >= current_log_post_subsampled_offset_ +
                          current_log_post_.NumRows())
    EnsureFrameIsComputed(subsampled_frame);
  output->CopyFromVec(current_log_post_.Row(
      subsampled_frame - current_log_post_subsampled_offset_));
}

◆ KALDI_DISALLOW_COPY_AND_ASSIGN()

KALDI_DISALLOW_COPY_AND_ASSIGN ( DecodableNnetSimple  )
private

◆ NumFrames()

int32 NumFrames ( ) const
inline

Definition at line 194 of file nnet-am-decodable-simple.h.

Referenced by main().

◆ OutputDim()

int32 OutputDim ( ) const
inline

Definition at line 196 of file nnet-am-decodable-simple.h.

Referenced by main().

Member Data Documentation

◆ compiler_

CachingOptimizingCompiler& compiler_
private

◆ current_log_post_

Matrix<BaseFloat> current_log_post_
private

Definition at line 275 of file nnet-am-decodable-simple.h.

◆ current_log_post_subsampled_offset_

int32 current_log_post_subsampled_offset_
private

Definition at line 279 of file nnet-am-decodable-simple.h.

◆ feats_

const MatrixBase<BaseFloat>& feats_
private

◆ ivector_

const VectorBase<BaseFloat>* ivector_
private

Definition at line 260 of file nnet-am-decodable-simple.h.

◆ log_priors_

CuVector<BaseFloat> log_priors_
private

◆ nnet_

const Nnet& nnet_
private

Definition at line 247 of file nnet-am-decodable-simple.h.

◆ nnet_left_context_

int32 nnet_left_context_
private

◆ nnet_right_context_

int32 nnet_right_context_
private

◆ num_subsampled_frames_

int32 num_subsampled_frames_
private

◆ online_ivector_feats_

const MatrixBase<BaseFloat>* online_ivector_feats_
private

Definition at line 263 of file nnet-am-decodable-simple.h.

◆ online_ivector_period_

int32 online_ivector_period_
private

Definition at line 266 of file nnet-am-decodable-simple.h.

◆ opts_

NnetSimpleComputationOptions opts_
private

◆ output_dim_

int32 output_dim_
private

Definition at line 250 of file nnet-am-decodable-simple.h.


The documentation for this class was generated from the following files:

nnet-am-decodable-simple.h
nnet-am-decodable-simple.cc