Reshape numpy array with minimum rate
I have an array that is not monotonically increasing. I would like to make it monotonically increasing by applying a constant rate wherever the array decreases.
I have created a small example here, where the rate is 0.2:
import numpy as np
import matplotlib.pyplot as plt
# Rate
rate = 0.2
# Array to interpolate
arr1 = np.array([0,1,2,3,4,4,4,3,2,2.5,3.5,5.2,7,10,9.5,np.nan,np.nan,np.nan,11.2, 11.4, 12,10,9,9.5,10.2,10.5,10.8,12,12.5,15],dtype=float)
# Line with constant rate starting at the first decrease (index 6)
xx1 = 6
xr1 = np.arange(arr1.shape[0] + 1, dtype=float)
yr1 = rate*xr1 + (arr1[xx1]-rate*xx1)
# Line with constant rate starting at the second decrease (index 13)
xx2 = 13
xr2 = np.arange(arr1.shape[0] + 1, dtype=float)
yr2 = rate*xr2 + (arr1[xx2]-rate*xx2)
# Line with constant rate starting at the third decrease (index 20)
xx3 = 20
xr3 = np.arange(arr1.shape[0] + 1, dtype=float)
yr3 = rate*xr3 + (arr1[xx3]-rate*xx3)
plt.figure()
plt.plot(arr1,'.-',label='Original')
plt.plot(xr1,yr1,label='Const Rate line 1')
plt.plot(xr2,yr2,label='Const Rate line 2')
plt.plot(xr3,yr3,label='Const Rate line 3')
plt.legend()
plt.grid()
plt.show()
The "Original" array is my dataset.
The final result I would like is the blue + red-dashed line. In the figure I also highlighted the "constant rate curves".
Since I have very large arrays (millions of records), I would like to avoid for-loops over the entire array.
Thanks a lot to everybody for the help!
python arrays numpy reshape rate
edited Nov 15 '18 at 12:28
Giuseppe Salerno
asked Nov 15 '18 at 8:58
Giuseppe Salerno
Shift a copy of the array by one sample and subtract that from the original to find where one sample is less than the next.
– Mark Setchell
Nov 15 '18 at 9:10
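The shift-and-subtract idea from the comment above can be sketched with `np.diff`, which computes `arr[1:] - arr[:-1]`. This is only a minimal illustration on the question's `arr1`; the names `step` and `drop_idx` are made up here, and the NaN gap is simply not flagged because comparisons with NaN evaluate to False.

```python
import numpy as np

arr1 = np.array([0, 1, 2, 3, 4, 4, 4, 3, 2, 2.5, 3.5, 5.2, 7, 10, 9.5,
                 np.nan, np.nan, np.nan, 11.2, 11.4, 12, 10, 9, 9.5,
                 10.2, 10.5, 10.8, 12, 12.5, 15], dtype=float)

# step[i] == arr1[i + 1] - arr1[i]; a negative value marks a decrease.
step = np.diff(arr1)

# Index of every sample that is lower than its predecessor.
# Comparisons involving NaN are False, so the NaN gap is not reported.
drop_idx = np.flatnonzero(step < 0) + 1
print(drop_idx)
```

For the example array this flags the samples at indices 7, 8, 14, 21 and 22. It only locates the decreases; it does not yet say where the corrected curve should rejoin the data.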
You should change "reshape" to "interpolate", as "reshaping" is something entirely unrelated to this.
– Nils Werner
Nov 15 '18 at 10:23
What if rate = 0.3? Then arr2 = np.array([0, 1, 2, 3, 4, 4, 4, 4.3, 4.6, 4.9, 5.2, 5, ...]). As you can see, it is not monotonically increasing, so you need to check this... Why not just skip the unwanted points and connect the successive increasing points with a line, e.g. arr1[6] = 4 with arr1[11] = 5, and arr1[13] = 10 with arr1[18] = 11.2?
– AndyK
Nov 15 '18 at 10:31
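The check this comment asks for could look roughly like the sketch below. `is_monotonic_increasing` is a hypothetical helper written for this illustration (it is not a NumPy function), and `arr2_head` holds only the values quoted in the comment, without the elided tail.

```python
import numpy as np

def is_monotonic_increasing(values, strict=False):
    """Return True if the non-NaN samples never decrease (hypothetical helper)."""
    finite = np.asarray(values, dtype=float)
    finite = finite[~np.isnan(finite)]   # ignore NaN gaps
    steps = np.diff(finite)
    if strict:
        return bool(np.all(steps > 0))
    return bool(np.all(steps >= 0))

# First samples of the rate = 0.3 example quoted in the comment
arr2_head = np.array([0, 1, 2, 3, 4, 4, 4, 4.3, 4.6, 4.9, 5.2, 5])

print(is_monotonic_increasing(arr2_head))       # False: the last step is 5 - 5.2 < 0
print(is_monotonic_increasing(arr2_head[:-1]))  # True once the final sample is dropped
```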
@AndyK I can connect the points as you suggest, but I need the connecting point to be above a line that starts from the first point and rises at a constant rate (I'll update the question to be clearer).
– Giuseppe Salerno
Nov 15 '18 at 11:06
Why would your expected output interpolate between x[20] and x[29], when x[26] = x[20] and x[27] > x[20]? Do you expect x[n] > x[n-1] + 0.2?
– Nils Werner
Nov 15 '18 at 12:44
5 Answers
Here's a different option: if you are interested in plotting a monotonically increasing curve from your data, you can simply skip the unwanted points between two successive increasing points, e.g. between arr1[6] = 4 and arr1[11] = 5.2, by connecting them with a line.
import numpy as np
import matplotlib.pyplot as plt
arr1 = np.array([0,1,2,3,4,4,4,3,2,2.5,3.5,5.2,7,10,9.5,np.nan,np.nan,np.nan,11.2, 11.4, 12,10,9,9.5,10.2,10.5,10.8,12,12.5,15],dtype=float)
# Keep only the samples that equal the running maximum (NaNs count as 0 there and are dropped)
mask = (arr1 == np.maximum.accumulate(np.nan_to_num(arr1)))
x = np.arange(len(arr1))
plt.figure()
plt.plot(x, arr1,'.-',label='Original')
# Plotting only the kept samples draws straight lines across the skipped ones
plt.plot(x[mask], arr1[mask], 'r-', label='Interp.')
plt.legend()
plt.grid()
plt.show()
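The answer above only draws the connecting lines. If the interpolated values themselves are needed as an array, one option (an addition for illustration, not part of the original answer) is to pass the same mask to `np.interp`. Note that this connects the kept points with straight lines and does not enforce the 0.2 rate from the question; the name `filled` is made up here.

```python
import numpy as np

arr1 = np.array([0, 1, 2, 3, 4, 4, 4, 3, 2, 2.5, 3.5, 5.2, 7, 10, 9.5,
                 np.nan, np.nan, np.nan, 11.2, 11.4, 12, 10, 9, 9.5,
                 10.2, 10.5, 10.8, 12, 12.5, 15], dtype=float)
x = np.arange(len(arr1))

# Same mask as in the answer: samples that equal the running maximum.
mask = arr1 == np.maximum.accumulate(np.nan_to_num(arr1))

# Linearly interpolate every index from the kept samples only.
filled = np.interp(x, x[mask], arr1[mask])
print(filled[6:12])
```

The result is non-decreasing rather than strictly increasing: the stretch between indices 20 and 27 becomes a flat line at 12.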
edited Nov 15 '18 at 14:32
answered Nov 15 '18 at 10:03
AndyK
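import numpy
# Difference between each sample and the previous one
arr2 = arr1[1:] - arr1[:-1]
# arr2[i] < 0 means that arr1[i + 1] is lower than arr1[i]
ind = numpy.where(arr2 < 0)[0] + 1
for i in ind:
    arr1[i] = arr1[i - 1] + rate
You may first need to replace any numpy.nan with values, such as numpy.nanmin(arr1). Note that this only corrects the samples that were lower than their predecessor in the original data; samples that follow a drop and are still below the corrected values are left unchanged.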
answered Nov 15 '18 at 9:08
Patol75
"I would like to avoid for-loops over the entire array."
Frankly speaking, it is hard to avoid for-loops entirely with numpy, because numpy, as a library written in C, itself uses for-loops implemented in C / C++. Routines like np.argwhere, np.all, etc. all require comparisons and therefore also iterations.
Instead, I suggest using one explicit loop in Python (the array is traversed only once):
import numpy as np
arr0 = np.zeros_like(arr1)
arr0[0] = arr1[0]
num = 1
rate = .2
while num < len(arr1):
    if arr1[num] < arr1[num-1] or np.isnan(arr1[num]):
        # Last value before the drop (or before the NaN gap)
        start = arr1[num-1]
        # Apply the constant rate until the data climbs back above `start`
        while num < len(arr1) and (start > arr1[num] or np.isnan(arr1[num])):
            arr0[num] = arr0[num-1] + rate
            num += 1
        continue
    arr0[num] = arr1[num]
    num += 1
answered Nov 15 '18 at 9:56
artona
Your problem can be expressed as one simple recursive difference equation:
y[n] = max(y[n-1] + 0.2, x[n])
So the direct Python form would be
import numpy as np
def func(a):
    out = np.zeros_like(a)
    out[0] = a[0]
    for i in range(1, len(a)):
        out[i] = max(out[i-1] + 0.2, a[i])
    return out
Unfortunately, this equation is recursive and non-linear, so finding a vectorized algorithm may be difficult.
However, using Numba we can speed up this loop-based algorithm by a factor of roughly 270:
import numba
fastfunc = numba.jit(func)
arr1 = np.random.rand(1000000)
%timeit func(arr1)
# 599 ms ± 13.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit fastfunc(arr1)
# 2.22 ms ± 107 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
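One possible way to vectorize the recurrence y[n] = max(y[n-1] + rate, x[n]), offered here as a sketch and not taken from the original answer: subtracting the ramp rate*n from both sides gives z[n] = y[n] - rate*n = max(z[n-1], x[n] - rate*n), which is a plain running maximum. The function name `min_rate_envelope` is made up for this illustration, and using `np.fmax` so that NaN samples are skipped is an assumption about how gaps should be treated. Like the loop above, this enforces the minimum rate everywhere, including flat stretches such as 4, 4, 4.

```python
import numpy as np

def min_rate_envelope(x, rate):
    """Vectorized form of y[n] = max(y[n-1] + rate, x[n]) with y[0] = x[0]."""
    x = np.asarray(x, dtype=float)
    ramp = rate * np.arange(len(x))
    # z[n] = y[n] - rate*n satisfies z[n] = max(z[n-1], x[n] - rate*n),
    # i.e. a running maximum. np.fmax ignores NaN, so samples inside a
    # NaN gap simply continue along the constant-rate line.
    z = np.fmax.accumulate(x - ramp)
    return z + ramp

arr1 = np.array([0, 1, 2, 3, 4, 4, 4, 3, 2, 2.5, 3.5, 5.2, 7, 10, 9.5,
                 np.nan, np.nan, np.nan, 11.2, 11.4, 12, 10, 9, 9.5,
                 10.2, 10.5, 10.8, 12, 12.5, 15], dtype=float)
y = min_rate_envelope(arr1, 0.2)
print(y[13:19])
```

Because the additions are carried out in a different order, the result can differ from the loop version in the last floating-point digits, so compare the two with `np.allclose` rather than exact equality.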
edited Nov 15 '18 at 14:09
answered Nov 15 '18 at 10:18
Nils Werner
I think OP wants 10.0, 10.2, 10.4, 10.6, 10.8, 11.2, not 10. , 10.24, 10.48, 10.72, 10.96, 11.2.
– AndyK
Nov 15 '18 at 10:41
I finally managed to do what I wanted with a while loop.
import numpy as np
# data is a pandas DataFrame; data['myvar'] is the original dataset I want to reshape
data['myvar_corrected'] = data['myvar'].values
temp_d = data['myvar'].fillna(0).values*1.0
dtc = np.maximum.accumulate(temp_d)
# Mark every sample that lies below the running maximum as NaN
data.loc[temp_d < dtc,'myvar_corrected'] = float('nan')
# Assumption: data_new (not defined in the original snippet) is a copy of data that receives the result
data_new = data.copy()
col_myvar = data_new.columns.get_loc('myvar')
stay_in_while = True
min_rate = 5/200000/(24*60)
idx_next = 0
while stay_in_while:
    df_temp = data.iloc[idx_next:]
    if df_temp['myvar_corrected'].isnull().sum()>0:
        idx_first_nan = df_temp.reset_index()['myvar_corrected'].isnull().argmax()
        idx_nan_or = (data_new.index.values==df_temp.index.values[idx_first_nan]).argmax()
        # Constant-rate line starting at the last valid sample before the gap
        x = np.arange(idx_first_nan-1,df_temp.shape[0])
        y0 = df_temp.iloc[idx_first_nan-1]['myvar_corrected']
        rate_curve = min_rate*x + (y0 - min_rate*(idx_first_nan-1))
        damage_m_rate = df_temp.iloc[idx_first_nan-1:]['myvar_corrected']-rate_curve
        try:
            # First sample that lies above the constant-rate line
            first_above = damage_m_rate[damage_m_rate>0].index.values[0]
            idx_intercept = (data_new.index.values==first_above).argmax()
            n_fill = (damage_m_rate.index.values==first_above).argmax()
            # rate_curve[0] belongs to the last valid sample, so the fill starts at element 1
            data_new.iloc[idx_nan_or:idx_intercept, col_myvar] = rate_curve[1:n_fill]
            # Restart from the intercept so that the next segment begins at a valid sample
            idx_next = idx_intercept
        except IndexError:
            stay_in_while = False
    else:
        stay_in_while = False
# Finally I have my result stored in data_new['myvar']
The following picture shows the result.
Thanks to everybody for the contributions!
answered Nov 19 '18 at 8:35
Giuseppe Salerno