Profiling¶
Table of Contents
The Python community has written some excellent profiling tools. This page includes some suggestions about using them to profile your code.
Using line_profile
¶
Begin by instantiating a profiler:
>>> import line_profiler
>>> ip = line_profiler.LineProfiler()
You must instruct this profiler to collect statistics on particular functions:
>>> import pydfmux
>>> ip.add_function(pydfmux.algorithms.measurement_algs.take_netanal.take_netanal)
>>> import dfmux_mezzanine_qc
>>> results = lp.run('dfmux_mezzanine_qc.main()')
Your profiling results are now available via the results
object:
>>> results.print_stats()
This call produces a large amount of output, including the following:
File: /home/gsmecher/winterland/pydfmux/algorithms/measurement_algs/take_netanal.py
Function: sweep_netanal at line 185
Line # Hits Time Per Hit % Time Line Contents
==============================================================
185 def sweep_netanal(rmod, freqmin, freqmax, npts, nsamps,
186 amp, target, logger=False):
187 """The worker function of the take_netanal.py module.
188
189 This function actually sets the frequencies and amplitudes, records the
190 data, calculates the phase and magnitude information, and outputs this to a
191 dictionary. It is hidden from tab-completion as it should not be called on
192 its own. Use the take_netanal function instead.
193
194 Args:
195 rmod: A ReadoutModule hardware-map object with which to take a network
196 analysis.
197
198 freqmin: The starting frequency of the scan (in Hz). Default: 2e5
199
200 freqmax: The ending frequency of the scan (in Hz). Default: 5e6
201
202 npts: The number of points to take in the scan. NOTE -- this will be
203 rounded to the next multiple of the number of channels being used
204 (typically 64, but the code queries the board to ask what
205 mux-factor it can do). Default: 500
206
207 nsamps: The number of samples to take at each frequency (how long to
208 integrate). Default: 128
209
210 amp: Amplitude in Normalized units for each frequency to use. Remember
211 that in general there will be 64 such frequencies simultaneously,
212 so divide up that dynamic range accordingly. Default: 0.01
213
214 target: The DAC to use in the scan. Options are 'nuller' and 'carrier'
215 (also accessible as dfmux attributes as d.TARGET.NULLER, d.TARGET.CARRIER)
216
217 logger: Standard logging object being used.
218 """
219 8 79 9.9 0.0 d = rmod.iceboard
220 8 227 28.4 0.0 rail_monitor = RailMonitor(rmod, logger=logger) # This is a class instantiation of the rail monitor
221 8 18 2.2 0.0 if not logger:
222 logger = _LM.get_child_logger(target=rmod, extra_field_dict={})
223 logger.info('STARTING Algorithm')
224
225 8 30 3.8 0.0 if target == d.TARGET.NULLER:
226 4 462 115.5 0.0 logger.info('Taking nuller network analysis.')
227 4 19 4.8 0.0 internal_target = d.ROUTING.NULLER
228 else:
229 4 612 153.0 0.0 logger.info('Taking carrier network analysis.')
230 4 63 15.8 0.0 internal_target = d.ROUTING.CARRIER
231
232 8 218829 27353.6 0.1 chanlist = rmod.channels.channel
233
234 8 24 3.0 0.0 if npts % len(chanlist):
235 8 18 2.2 0.0 npts += len(chanlist) - npts % len(chanlist)
236 8 1056 132.0 0.0 logger.info("Selected number of points doesn't evenly divide the available channels. Rounding up to {0} points".format(npts))
237
238 8 420 52.5 0.0 freqlist = np.linspace(freqmin, freqmax, npts)
239 8 170 21.2 0.0 freqlist = np.around(freqlist, decimals=1) # Just making the output easier
240 8 20 2.5 0.0 niters = npts / len(chanlist)
241 8 14 1.8 0.0 maglist = []
242 8 12 1.5 0.0 phaselist = []
243 8 11 1.4 0.0 noiselist = []
244 8 12 1.5 0.0 vdemodlist = []
245 8 12 1.5 0.0 amps_raw = []
246 8 12 1.5 0.0 amps_cal = []
247 8 12 1.5 0.0 overloadcountlist = []
248
249 8 4520884 565110.5 1.6 rmod.channels.set_amplitude(amp, d.UNITS.NORMALIZED, target)
250
251 72 189 2.6 0.0 for iiter in range(niters):
252 64 8432 131.8 0.0 logger.info('Taking frequency set {0} of {1}'.format(iiter + 1, niters))
253
254 # Resetting the DMFD routing, for safety
255 64 865 13.5 0.0 rmod.set_dmfd_routing(target=d.ROUTING.ADC, target_module=rmod.module,
256 64 389958 6093.1 0.1 target_mezzanine=rmod.mezzanine.mezzanine)
257 4160 321058 77.2 0.1 for channel in rmod.channels:
258 4096 32020 7.8 0.0 channel.set_frequency(freqlist[(iiter * 64) + channel.channel - 1],
259 4096 21271228 5193.2 7.7 d.UNITS.HZ, d.TARGET.DEMOD)
260 4096 33903 8.3 0.0 channel.set_frequency(freqlist[(iiter * 64) + channel.channel - 1],
261 4096 21723862 5303.7 7.9 d.UNITS.HZ, target)
262
263 64 5116163 79940.0 1.9 rail_monitor.check_DAC_rail(target=target)
264 64 4448112 69501.8 1.6 rail_monitor.check_ADC_rail()
265 64 9319 145.6 0.0 logger.debug('Getting Samples')
266 64 87460093 1366564.0 31.7 samples = rmod.get_samples(num_samples=nsamps)
267 64 8825 137.9 0.0 logger.debug('Getting Calibration Rerouted Samples')
268 64 750 11.7 0.0 rmod.set_dmfd_routing(target=internal_target, target_module=rmod.module,
269 64 605957 9468.1 0.2 target_mezzanine=rmod.mezzanine.mezzanine)
270 64 4263775 66621.5 1.5 rail_monitor.check_ADC_rail()
271 64 88379403 1380928.2 32.0 cal_samples = rmod.get_samples(num_samples=nsamps)
272
273 # Calculate the quantities we want as we go
274 4160 366895 88.2 0.1 for chan in rmod.channels:
275 4096 21363718 5215.8 7.7 chanfreq = chan.get_frequency(d.UNITS.HZ, d.TARGET.NULLER)
276 4096 84324 20.6 0.0 pfb_cor = 1.0 / pfb_gains.predict_amplitude_factor(chanfreq)
277 4096 137108 33.5 0.0 complexData = (pfb_cor * np.array(samples.i[chan.channel - 1]).astype('float64')) + \
278 4096 123754 30.2 0.0 (pfb_cor * np.array(samples.q[chan.channel - 1]).astype('float64') * 1.j)
279 4096 84030 20.5 0.0 complexCalData = (pfb_cor * np.array(cal_samples.i[chan.channel - 1]).astype('float64')) + \
280 4096 100469 24.5 0.0 (pfb_cor * np.array(cal_samples.q[chan.channel - 1]).astype('float64') * 1.j)
281 4096 6633 1.6 0.0 if amp == 0.:
282 complexCalData = complexData * 0. + 1.
283
284 4096 22743 5.6 0.0 fullData = complexData / complexCalData
285 4096 139277 34.0 0.1 meanData = np.mean(fullData)
286 4096 7907 1.9 0.0 mag = abs(meanData)
287 4096 29091 7.1 0.0 phase = np.arctan2(meanData.imag, meanData.real) * 57.2957795 # in degrees
288 4096 12232 3.0 0.0 phase %= 360 # ensure 0 <= phase < 360
289
290 4096 6775 1.7 0.0 maglist.append(mag)
291 4096 5494 1.3 0.0 phaselist.append(phase)
292 4096 236471 57.7 0.1 noiselist.append(np.std(abs(fullData)) / np.sqrt(nsamps))
293 4096 76083 18.6 0.0 raw = np.mean(complexData)
294 4096 61301 15.0 0.0 cal = np.mean(complexCalData)
295 4096 6409 1.6 0.0 amps_raw.append(raw)
296 4096 5776 1.4 0.0 amps_cal.append(cal)
297 # amps_raw.append(np.mean(complexData))
298 # amps_cal.append(np.mean(complexCalData))
299 64 293283 4582.5 0.1 overloadcountlist.append(rmod.get_overload_count(d.TARGET.DEMOD))
300 64 335 5.2 0.0 vdemodlist.append(rmod.get_fast_samples(units=d.UNITS.NORMALIZED,
301 64 93 1.5 0.0 average=True,
302 64 5987434 93553.7 2.2 target=d.TARGET.DEMOD))
303
304 8 1128 141.0 0.0 logger.debug('Finishing, Restoring re-routing, re-zeroing channels')
305
306 8 47077 5884.6 0.0 rmod.set_dmfd_routing(target=d.ROUTING.ADC, target_module=rmod.module, target_mezzanine=rmod.mezzanine.mezzanine)
307 8 2790021 348752.6 1.0 rmod.channels.set_amplitude(0, d.UNITS.NORMALIZED, target)
308 8 2797171 349646.4 1.0 rmod.channels.set_frequency(0, d.UNITS.HZ, target)
309 8 2678879 334859.9 1.0 rmod.channels.set_frequency(0, d.UNITS.HZ, d.TARGET.DEMOD)
310
311 8 30 3.8 0.0 data = {
312 8 20 2.5 0.0 'description': 'Network Analysis using the {0}'.format(target),
313 8 13 1.6 0.0 'freq': freqlist,
314 8 12 1.5 0.0 'amp': maglist,
315 8 14 1.8 0.0 'phase': phaselist,
316 8 9 1.1 0.0 'noise': noiselist,
317 8 10 1.2 0.0 'vdemod': vdemodlist,
318 8 9 1.1 0.0 'overloadcount': overloadcountlist,
319 8 11 1.4 0.0 'amp_raw': amps_raw,
320 8 10 1.2 0.0 'amp_cal': amps_cal}
321
322 8 11 1.4 0.0 return data
This output immediately tells us that get_samples()
calls are responsible
for 63% of the script’s runtime. The remaining calls are possibly worth
optimizing, but Amdahl’s Law
tells us we’ll never be able to halve the runtime unless we tackle
get_samples()
.