Posted on January 11, 2024

Analysis of Job Failures Caused by Spool Number Overflow

1. Time and symptoms of the incident:

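At 00:03 on March 7, I found that two Z411K background jobs had failed. The cause was spool_internal_error, and the detailed error message was "spool overflow". I then checked the corresponding update records and found none. I immediately called Sun Bin and Ma Chuanbing but could not reach either of them.

At 00:09, a check of the other jobs showed a large number of failed Z411K jobs: some had failed with spool_internal_error, the rest with the system exception error_message. A check of the MRP jobs showed that all of them had failed with spool_internal_error, and I informed Li Xiang of the situation. I then called Cui Peng and Wu Yong'an, and with their help used transaction SPAD to delete the old spool requests that were more than 7 days old. Li Xiang rescheduled the MRP jobs; by 01:53 all MRP jobs had completed and MRP was back to normal.

Because Sun Bin and Ma Chuanbing were still unreachable, I contacted Wang Jianing, who rescheduled the Z411K background jobs. In the meantime I reached Ding Yongqin, who changed the relevant background job users to dialog users so that the jobs could be rescheduled. At around 04:30, a check showed that all Z411K jobs had completed and everything was back to normal.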

2. Impact of the incident:

A large number of background jobs failed, including important jobs such as MRP and 411K.

3. Resolution:

After some spool requests were deleted manually, the failed jobs were rescheduled and completed normally.

4. Root cause analysis:

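The main cause of these job failures was that the jobs scheduled by users made heavy use of spool logs. This exhausted the spool number range (overflow), which in turn caused the jobs to fail.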

Long-term remediation:

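1. Avoid using spool logs to record detailed logs in background jobs wherever possible (summary logs are acceptable). If detailed logging is really required, consider using a log table instead. This needs to be confirmed jointly by the business departments and us, after which the system development team would adjust the programs.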

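2. If changing the number interval is permissible, the spool number range can be modified as described in Note 48284. This needs to be evaluated jointly with the business departments before the parameter is changed.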
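As an illustration of the first measure, the sketch below writes per-item detail records to a log table and returns only a one-line summary that a job could print to its spool output. It is a minimal sketch in Python with an in-memory SQLite database, not the actual implementation for the affected programs; the table name `job_log`, the function `write_job_log`, and the sample records are all hypothetical.

```python
import sqlite3


def write_job_log(conn, job_name, records):
    """Store detail records in a log table and return a one-line summary.

    `records` is a list of (item, status) tuples. The table layout is a
    hypothetical example, not taken from the real system.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS job_log ("
        "job_name TEXT, item TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO job_log (job_name, item, status) VALUES (?, ?, ?)",
        [(job_name, item, status) for item, status in records],
    )
    conn.commit()
    # Only this short summary would go to the spool; details stay in the table.
    failed = sum(1 for _, status in records if status != "OK")
    return f"{job_name}: {len(records)} items processed, {failed} failed"


conn = sqlite3.connect(":memory:")
summary = write_job_log(
    conn, "Z411K", [("doc-1", "OK"), ("doc-2", "ERROR"), ("doc-3", "OK")]
)
print(summary)
```

The point of the design is that the volume of detailed logging no longer drives the amount of spool output a job produces.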

Appendix: SAP Note and the users' spool request counts at the time

The number of spool requests owned by the background job users was very large.

Note 48284 - System can no longer create spool requests

Summary

Symptom

Creating spool requests takes a long time. Finally, the system cannot create any more requests and the short dump SPOOL_INTERNAL_ERROR occurs. In the dump itself and in the syslog, the system issues the message "Spool full" or "Spool overflow".

Other terms

spool full, spool overflow, FBN

Reason and Prerequisites

In the standard SAP system, the number of spool requests that can be created is limited to 32000. If you reach this limit, there are no more free numbers and the errors described above occur.
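The behaviour described here can be pictured with a small model of a bounded number pool. This is only a sketch under stated assumptions: the class `SpoolNumberPool` is hypothetical, and the cyclic search for the next free number is an assumption made for illustration, not a description of the SAP kernel's actual logic.

```python
class SpoolNumberPool:
    """Toy model of a spool number interval 1..upper_limit (hypothetical)."""

    def __init__(self, upper_limit=32000):
        self.upper_limit = upper_limit
        self.in_use = set()   # numbers held by existing spool requests
        self.last = 0         # last number handed out

    def allocate(self):
        """Return the next free number, or fail when none is left."""
        if len(self.in_use) >= self.upper_limit:
            raise RuntimeError("spool overflow: no free spool numbers")
        n = self.last
        while True:
            n = n % self.upper_limit + 1   # wrap around after upper_limit
            if n not in self.in_use:
                break
        self.in_use.add(n)
        self.last = n
        return n

    def delete(self, number):
        """Deleting a spool request frees its number for reuse."""
        self.in_use.discard(number)


pool = SpoolNumberPool(upper_limit=3)       # tiny interval for the demo
numbers = [pool.allocate() for _ in range(3)]
try:
    pool.allocate()
    overflowed = False
except RuntimeError:
    overflowed = True                        # all numbers are in use
pool.delete(2)                               # like deleting an old request
reused = pool.allocate()
```

In this model, freeing numbers by deleting old requests is what allows new requests to be created again, which matches how the incident was resolved.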

Solution

You can raise the upper limit for spool requests. As of Release 4.0, you can set the upper limit to as high as 2^31 (previously 99,000). However, we recommend that you do not set the interval higher than 999,999 because human users find higher numbers difficult to process.

Proceed as follows:

1. Log on to the system in client 000 and call transaction SNRO.

2. In the "To No." column, change the upper limit of the interval SPO_NUM to 999,999.

The size of the interval also determines the maximum number of spool requests that can exist in the system. To ensure that the system performance does not deteriorate, you must use RSPO0041 or RSPO1041 on a regular basis to delete spool requests that are no longer required. The number of spool requests that can be held "officially" in the system depends to a great extent on the capacity of the database and the database computer. Only the number of spool requests simultaneously held in the system is relevant, not the size of the number intervals.
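A rough way to reason about the number of requests held at the same time is to multiply the daily creation rate by the retention period of the cleanup job. The sketch below does that arithmetic; it assumes a constant creation rate and a regular cleanup, and the figure of 5,000 requests per day is a made-up example, not a measurement from the incident.

```python
def requests_held(created_per_day, retention_days):
    """Approximate steady-state number of spool requests in the system."""
    return created_per_day * retention_days


def fits_in_interval(created_per_day, retention_days, upper_limit):
    """True if the steady-state count stays below the interval's upper limit."""
    return requests_held(created_per_day, retention_days) < upper_limit


# Hypothetical load: 5,000 requests per day, kept for 7 days.
held = requests_held(5000, 7)
ok_default = fits_in_interval(5000, 7, 32000)     # standard limit
ok_raised = fits_in_interval(5000, 7, 999999)     # recommended maximum
ok_shorter = fits_in_interval(5000, 3, 32000)     # shorter retention instead
```

Under these assumed figures, 35,000 requests would be held at once, which exceeds the standard limit of 32000; either raising the interval or shortening the retention period brings the load back within range.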

You can use the spool number monitor in transaction RZ20 to specify threshold values at which the system must create an alert if a certain percentage of the spool numbers are allocated.
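The idea of a percentage-based threshold can be sketched as follows. This is an illustrative calculation only, not the RZ20 implementation; the function `spool_number_alert` and the 80% and 90% thresholds are hypothetical values chosen for the example.

```python
def spool_number_alert(allocated, upper_limit, warn_pct=80.0, critical_pct=90.0):
    """Classify spool number usage against two hypothetical thresholds.

    Returns a (level, used_pct) tuple, where level is "ok", "warning",
    or "critical".
    """
    used_pct = 100.0 * allocated / upper_limit
    if used_pct >= critical_pct:
        level = "critical"
    elif used_pct >= warn_pct:
        level = "warning"
    else:
        level = "ok"
    return level, used_pct


# Example readings against the standard limit of 32000 numbers.
print(spool_number_alert(16000, 32000))
print(spool_number_alert(26000, 32000))
print(spool_number_alert(28800, 32000))
```

An alert raised at the warning level would have left time to delete old requests before the interval was exhausted.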