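While grinding away at work recently, I noticed that a pandas DataFrame in Python raises the following error when you run fairly complex conditional checks on it: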
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
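For reference, here is a minimal sketch of how this error typically shows up. The sample data and the helper name raises_ambiguous are made up for illustration; only the column names come from my case. Using a whole-column comparison with `if` / `or` fails, while the element-wise operators `|` and `&` (with parentheses) work on whole columns.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the post.
dt = pd.DataFrame({"apple": [5, 12, 20], "banana": [90, 50, 70]})


def raises_ambiguous(df):
    """Return True if an `if` on whole-column comparisons raises ValueError."""
    try:
        # `df["apple"] > 16` is a boolean Series, but `or` needs a single bool.
        if df["apple"] > 16 or df["banana"] >= 85:
            pass
    except ValueError:
        return True
    return False


# Element-wise version: | and & combine boolean Series row by row.
mask = (dt["apple"] > 16) | ((dt["apple"] < 10) & (dt["banana"] >= 85))

print(raises_ambiguous(dt))  # True
print(mask.tolist())         # [True, False, True]
```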
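My case is fairly involved: it checks the values of four columns (['apple', 'banana', 'cat', 'dog']), mixes `or` and `and`, and produces a new column ['egg']. Wrapping the logic in a function defined with def finally worked, with no error!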
def function(a, b, c, d):
    # a, b, c, d are the scalar values of one row, not whole columns.
    if a > 16 or (a < 10 and b >= 85):
        # Score based on c, using left-closed intervals.
        if c < 40:
            return -4
        elif c >= 40 and c < 45:
            return -3
        elif c >= 45 and c < 55:
            return -2
        elif c >= 55 and c < 60:
            return -1
        elif c >= 60 and c < 65:
            return 0
        elif c >= 65 and c < 70:
            return 1
        elif c >= 70 and c < 75:
            return 2
        elif c >= 75 and c < 80:
            return 3
        else:
            return 4
    elif (a >= 10 and a <= 16) or (a < 10 and b < 85):
        # Score based on d, using right-closed intervals.
        if d <= -1000:
            return -4
        elif d > -1000 and d <= -800:
            return -3
        elif d > -800 and d <= -600:
            return -2
        elif d > -600 and d <= -300:
            return -1
        elif d > -300 and d <= -200:
            return 0
        elif d > -200 and d <= -50:
            return 1
        elif d > -50 and d <= 80:
            return 2
        elif d > 80 and d <= 160:
            return 3
        else:
            return 4
    else:
        # Reached when neither branch matches (e.g. a value is NaN); returns None.
        print('error')
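Then use the DataFrame to generate the new column, passing the actual columns in as a, b, c, d. Because apply with axis=1 calls the function once per row, each argument is a single value rather than a whole Series, so the `if` tests are no longer ambiguous: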
dt['egg'] = dt.apply(lambda x: function(x['apple'], x['banana'], x['cat'], x['dog']), axis=1)
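As an aside, the same scoring can probably be done without a row-wise apply. The block below is only a sketch of one possible vectorized alternative, not the approach I actually used: it assumes numeric columns, uses pd.cut to bin 'cat' and 'dog', and np.select to pick between the two scores. The function name score_vectorized and the sample rows are hypothetical.

```python
import numpy as np
import pandas as pd

# Bin edges copied from the if/elif thresholds above.
C_BINS = [-np.inf, 40, 45, 55, 60, 65, 70, 75, 80, np.inf]
D_BINS = [-np.inf, -1000, -800, -600, -300, -200, -50, 80, 160, np.inf]


def score_vectorized(df):
    """Hypothetical vectorized version of the row-wise scoring function."""
    a, b = df["apple"], df["banana"]
    # labels=False yields bin indices 0..8; subtracting 4 maps them to -4..4.
    # 'cat' uses left-closed bins [lo, hi); 'dog' uses right-closed bins (lo, hi].
    score_c = pd.cut(df["cat"], bins=C_BINS, right=False, labels=False) - 4
    score_d = pd.cut(df["dog"], bins=D_BINS, right=True, labels=False) - 4
    use_c = (a > 16) | ((a < 10) & (b >= 85))
    use_d = a.between(10, 16) | ((a < 10) & (b < 85))
    # Rows matching neither condition (e.g. NaN inputs) get NaN.
    return np.select([use_c, use_d], [score_c, score_d], default=np.nan)


# Made-up sample rows, one for each branch of the conditions.
dt = pd.DataFrame({
    "apple": [20, 5, 12, 5],
    "banana": [0, 90, 0, 50],
    "cat": [42, 80, 0, 0],
    "dog": [0, 0, -1000, 200],
})
dt["egg"] = score_vectorized(dt)
print(dt["egg"].tolist())  # [-3.0, 4.0, -4.0, 4.0]
```

Note that the result is a float column because the fallback value is NaN. On large frames a version like this tends to be faster than apply with axis=1, but it is worth checking against the def-based function on your own data first.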